Open DanielProkhorov opened 5 months ago
SeeClick is currently an exploratory effort, and the base model does not handle complex dragging or scrolling up and down. Further fine-tuning is needed to obtain such capabilities; in AITW, for example, the model needs to swipe up and down on mobile phones.
So the paper only demonstrates various click actions across different domains?
What about sliders and swipes? These need both a from (x, y) and a to (x, y) coordinate. How can they be obtained?
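
To make the question concrete, here is a minimal sketch of what such a from/to pair could look like. It is not SeeClick's API or the paper's method. It assumes coordinates are normalized to [0, 1] with the origin at the top-left of the screenshot, and every name in it (`Swipe`, `swipe_from_direction`, `slider_drag`, `to_pixels`) is hypothetical. The idea is that a directional swipe can be derived from a fixed center point and distance, while a slider drag can be derived from the slider's bounding box, which would itself have to come from a grounding step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Swipe:
    """Hypothetical action type: a drag from `start` to `end`, both normalized (x, y)."""
    start: tuple
    end: tuple


def _clamp(value):
    # Keep a normalized coordinate inside the screen.
    return min(1.0, max(0.0, value))


def swipe_from_direction(direction, center=(0.5, 0.5), distance=0.3):
    """Build a swipe of length `distance` centered on `center`.

    Assumption: y grows downward, so an "up" swipe moves the finger
    from a larger y to a smaller y.
    """
    offsets = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
    dx, dy = offsets[direction]
    cx, cy = center
    half = distance / 2
    start = (_clamp(cx - dx * half), _clamp(cy - dy * half))
    end = (_clamp(cx + dx * half), _clamp(cy + dy * half))
    return Swipe(start, end)


def slider_drag(bbox, current, target):
    """Build a drag along a horizontal slider.

    `bbox` is (left, top, right, bottom) of the slider track, assumed to be
    known already. `current` and `target` are fractions in [0, 1] along the track.
    """
    left, top, right, bottom = bbox
    y = (top + bottom) / 2
    width = right - left
    return Swipe((left + current * width, y), (left + target * width, y))


def to_pixels(point, screen_width, screen_height):
    """Convert a normalized (x, y) point to integer pixel coordinates."""
    return (round(point[0] * screen_width), round(point[1] * screen_height))


if __name__ == "__main__":
    up = swipe_from_direction("up")
    print(up, to_pixels(up.start, 1080, 1920), to_pixels(up.end, 1080, 1920))
    print(slider_drag((0.2, 0.4, 0.8, 0.5), current=0.0, target=0.5))
```

In this sketch only the slider case needs model output (the track's bounding box). A plain scroll uses fixed geometry, so the model would only need to predict the direction.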