OSU-NLP-Group / SeeAct

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
https://osu-nlp-group.github.io/SeeAct/
Other
571 stars 69 forks

som branch bounding boxes don't show up for certain websites. #40

Closed: mlin12321 closed this issue 1 month ago

mlin12321 commented 2 months ago

Specifically for Spotify, interactive elements do not appear to be labeled with bounding boxes on the som branch.

mlin12321 commented 1 month ago

Never mind, I think Spotify is the only website where this doesn't work. It presumably has something to do with the way they lay out cards.