OSU-NLP-Group / SeeAct

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
https://osu-nlp-group.github.io/SeeAct/

Some interactable elements don't show up for element selection. #45

Open mlin12321 opened 1 month ago

mlin12321 commented 1 month ago

As the title says. For example, on YouTube, the option to adjust a video's playback speed doesn't show up among the selectable elements after clicking the video settings button.