njucckevin / SeeClick

The model, data and code for the visual GUI Agent SeeClick
Apache License 2.0

How to handle sliders and swipes? #5

Open DanielProkhorov opened 5 months ago

DanielProkhorov commented 5 months ago

So the paper only demonstrates click actions across different domains?

What about sliders and swipes? These need both a start (x, y) and an end (x, y) coordinate. How can they be obtained?

njucckevin commented 5 months ago

SeeClick is currently an exploratory effort, and the base model does not handle complex actions such as dragging or scrolling. Obtaining these capabilities requires further fine-tuning; for example, on AITW the model has to learn to swipe up and down on mobile screens.
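
To make the "from (x, y) to (x, y)" idea concrete, here is a minimal sketch of how a two-point swipe could be post-processed if a fine-tuned model were trained to emit one. Everything in it is an assumption for illustration: the output string format, the `Swipe` class, and the `parse_swipe` helper are hypothetical and not part of the SeeClick code, and coordinates are assumed to be normalized to [0, 1].

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Hypothetical output format of a fine-tuned model: two normalized points,
# e.g. "swipe from (0.50, 0.80) to (0.50, 0.20)".
# The base SeeClick model does not produce this; it is assumed here.
POINT_RE = re.compile(r"\(\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*\)")


@dataclass
class Swipe:
    start: Tuple[int, int]  # pixel (x, y) where the gesture begins
    end: Tuple[int, int]    # pixel (x, y) where the gesture ends


def parse_swipe(text: str, width: int, height: int) -> Swipe:
    """Extract two normalized points from text and scale them to pixels."""
    points = [(float(x), float(y)) for x, y in POINT_RE.findall(text)]
    if len(points) != 2:
        raise ValueError(f"expected 2 points, found {len(points)}")
    for x, y in points:
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            raise ValueError(f"point ({x}, {y}) is outside [0, 1]")

    def to_pixels(point: Tuple[float, float]) -> Tuple[int, int]:
        return round(point[0] * width), round(point[1] * height)

    return Swipe(start=to_pixels(points[0]), end=to_pixels(points[1]))


# Example: a vertical swipe up on a 1080x1920 screen.
swipe = parse_swipe("swipe from (0.50, 0.80) to (0.50, 0.20)", 1080, 1920)
print(swipe)
```

The resulting pixel pairs could then be passed to whatever automation backend performs the gesture; a slider drag would use the same two-point shape with a horizontal displacement.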