JHU-CLSP / turking-bench

Web-grounded natural language instructions
https://turkingbench.github.io
Apache License 2.0

GPT4TextVision Baseline Works and Beginning of General Mouse/Keyboard controls on Vision Models Skeleton #126

Closed (klxu03 closed 9 months ago)

klxu03 commented 9 months ago

@danyaljj

klxu03 commented 9 months ago

Merge when you're happy.

One thing to consider, if it's not difficult: we have a file, actions.py. Some of the actions that you have in vision.py could live inside actions.py. Just a suggestion; no need to do it now. We can also get back to this in the future.

This is true. I kept them separate because I wanted to build out a new evaluation system with vision.py, and having it separate made that easier for me. To quickly put together the gpt4-text-vision benchmark, though, I just re-used the code I wrote over there.