roboflow / gpt-checkup

Monitor the performance of OpenAI's GPT-4V model over time.
https://www.gptcheckup.com

Planning to add more models in the testing suite? #20

Open AyushExel opened 2 months ago

AyushExel commented 2 months ago

Hey, great project. Are you guys planning to add more models?

AyushExel commented 2 months ago

Also, packaging this project would allow me to easily import parts of it for my project :)

josephofiowa commented 1 month ago

@AyushExel We've considered extending this eval to models like these: [image: WhatsApp Image 2024-07-05 at 5 11 19 PM]

Are there specific models you'd like to also see evals for?

Noted re: packaging request :) It's a bit out of scope for the current work on this project, though I could see other places where we can make that simpler.