Open CarlHuangNuc opened 1 month ago
We actually did test on OSWorld (screenshot-only, pixel-level grounding). However, since this version does not use any Desktop data, its performance is not yet satisfactory in live Desktop environments (as you can see from the results on ScreenSpot-Desktop, or by trying the Hugging Face demo). We are planning improvements for Desktop UIs.
If you're interested, I would be more than happy to discuss further thoughts and findings on GUI Agents. Feel free to reach out via email.
I hope to discuss these directions further with the author. My email: huangke1@lenovo.com
Hi Author,
https://os-world.github.io/