yfzhang114 / SliME

✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
Apache License 2.0

Evaluation on REC task #7

Open YuchenLiu98 opened 1 month ago

YuchenLiu98 commented 1 month ago

Hi, authors

Thanks a lot for your excellent work. Do you plan to test the model on the referring expression comprehension (REC) task? Since it is a more fine-grained task, it may benefit more from high-resolution training.

yfzhang114 commented 1 month ago

We are very interested in exploring this task further. If you could provide us with the dataset, ideally formatted similarly to the VQA format used in datasets like MME, we would be glad to assist in testing and evaluating our model on this task.
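For illustration only, here is a minimal sketch of how REC annotations might be turned into question/answer records and scored. The record layout (`image`, `question`, `answer`), the prompt wording, and the function names are all assumptions made for this example; they are not the actual MME format or anything from the SliME codebase. Scoring uses box IoU with a 0.5 threshold, which is a common convention for REC benchmarks, and boxes are assumed to be `[x1, y1, x2, y2]` in pixels.

```python
def rec_to_vqa_record(image, expression, box):
    """Build one question/answer record from a REC annotation.

    The field names and prompt text are hypothetical, chosen for this sketch.
    """
    x1, y1, x2, y2 = box
    return {
        "image": image,
        "question": f"Provide the bounding box of the region described by: {expression}",
        "answer": f"[{x1}, {y1}, {x2}, {y2}]",
    }


def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def rec_accuracy(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of predictions whose IoU with the ground truth meets the threshold."""
    if not gt_boxes:
        return 0.0
    hits = sum(box_iou(p, g) >= threshold for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(gt_boxes)
```

A record built this way could be written out one per line as JSON, and the model's predicted boxes would need to be parsed back from its text output before calling `rec_accuracy`.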