Evaluation for more datasets

TRI-ML / vlm-evaluation

VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning

Evaluation for more datasets #7

Closed: Lauch1ng closed this issue 2 months ago

Lauch1ng commented 3 months ago

Hello, will you support evaluation for more benchmarks (e.g., MME)? Thanks a lot!

ashwin-balakrishna96 commented 2 months ago

We are working on adding more evaluation benchmarks, and MME would definitely be a great one. If you are interested in adding support for more benchmarks and opening a PR, we would be happy to review it and include it in the evaluations. Thanks!