Earth-Intelligence-Lab / vleo-bench

To launch the benchmarking #1

Open Gpoxolcku opened 3 months ago

Gpoxolcku commented 3 months ago

Hi! Awesome work and dataset collection! Is there a script (or a plan to release one) for running a model's benchmark evaluation on the full set of data and getting a comprehensive report on all the metrics?

danielz02 commented 3 months ago

Thanks a lot for your interest! We are working on it :) The end goal would be to support common Hugging Face models. Do you have any model in mind?

Gpoxolcku commented 3 months ago

Thank you for the quick answer! Do you have an approximate release date? I'm developing yet another model and am interested in metrics to track its progress :)

danielz02 commented 3 months ago

I'm thinking of some time around ICLR, which is early May, but I can definitely adjust the priority if there is a need for evaluating new models. What interface does your model use? Is it a Hugging Face pipeline or a Llava-like interface?

Gpoxolcku commented 3 months ago

That would be very nice of you, thank you! I use a Llava-like interface on a local machine.

Gpoxolcku commented 2 weeks ago

Hi, has there been any progress on finalizing the eval scripts? Such a benchmark would be very helpful for my projects, thanks :)