boyugou / llava_uground

Apache License 2.0

Is there a plan to test your work on OSWorld? #2

Open CarlHuangNuc opened 1 month ago

CarlHuangNuc commented 1 month ago

Hi Author,

Do you have a plan to test your work on OSWorld?

https://os-world.github.io/

boyugou commented 1 month ago

We actually did test on OSWorld (screenshot-only, pixel-level grounding). However, because this version does not use any desktop data, its performance in live desktop environments is not yet satisfactory, as you can see from the results on ScreenSpot-Desktop or by trying the Hugging Face demo. We are planning improvements for desktop UIs.

If you're interested, I would be more than happy to discuss further thoughts and findings on GUI Agents. Feel free to reach out via email.

CarlHuangNuc commented 1 month ago

I would like to discuss these directions further with the author. My email: huangke1@lenovo.com