mathllm / MATH-V

MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.
https://mathllm.github.io/mathvision/
MIT License

some pictures are too big #4

qianzhouyi2 closed this issue 2 months ago

qianzhouyi2 commented 2 months ago

The images for question IDs 2467 and 2955 are too large compared with the other pictures. With the standard Qwen2-VL-7B settings, they cause an out-of-memory error on an A100 80GB.

scikkk commented 2 months ago

Thank you for your feedback! However, more than 30 images in the dataset are larger (in width/height/file size) than the ones for question IDs 2467 and 2955. Specifically, image 2467 is 178 KB and image 2955 is 256 KB, both of which are within a reasonable size range. Let me know if you have any other concerns! (:
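
For anyone who hits the same out-of-memory error, one possible workaround is to cap the pixel count of each image before it reaches the model. The snippet below is only a rough sketch and not part of the MATH-V code. The helper names `fit_to_pixel_budget` and `approx_visual_tokens` are hypothetical, and the 28x28 pixels per visual token figure and the budget of 1280 tokens are assumptions that should be checked against the documentation of the model you run.

```python
import math


def fit_to_pixel_budget(width, height, max_pixels, multiple=28):
    """Return (new_width, new_height) scaled down to fit a pixel budget.

    Hypothetical helper: keeps the aspect ratio approximately, never
    upscales, and rounds each side down to a multiple of `multiple`.
    Each side is clamped to at least `multiple`, so a very small budget
    or an extreme aspect ratio can still exceed `max_pixels`.
    """
    if width <= 0 or height <= 0 or max_pixels <= 0:
        raise ValueError("width, height and max_pixels must be positive")
    # Scale factor that brings the area down to max_pixels (at most 1.0).
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    new_w = max(multiple, math.floor(width * scale / multiple) * multiple)
    new_h = max(multiple, math.floor(height * scale / multiple) * multiple)
    return new_w, new_h


def approx_visual_tokens(width, height, pixels_per_side=28):
    """Rough token estimate, ASSUMING one token per 28x28 pixel block."""
    return (width // pixels_per_side) * (height // pixels_per_side)


if __name__ == "__main__":
    budget = 1280 * 28 * 28  # assumed budget: about 1280 visual tokens
    w, h = fit_to_pixel_budget(4000, 3000, budget)
    print(w, h, approx_visual_tokens(w, h))
```

The computed size can then be passed to whatever resize step your pipeline already uses. Note that this budget depends on pixel dimensions, so an image with a small file size in KB can still be expensive if its width and height are large.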