cooelf / Auto-GUI

Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
https://arxiv.org/abs/2309.11436
Apache License 2.0

Incompatible action space with AitW #4

Closed Jiayi-Pan closed 11 months ago

Jiayi-Pan commented 11 months ago

Dear authors, thank you for this great work! Is the task_impossible action defined in the original AitW paper also part of the action space used in this paper?

Jiayi-Pan commented 11 months ago

And if you've removed the task_impossible trajectories, could you clarify how you evaluated the models on AitW? Thanks :)

cooelf commented 11 months ago

Hi, our action space is the same as AitW's (see https://github.com/cooelf/Auto-UI/blob/main/action_type.py). We did not remove any trajectories. Checking the model output, we can see 'action_type': '11', which means 'task_impossible'.
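
To make the mapping concrete, here is a minimal sketch of how a predicted 'action_type' string could be decoded into a name. Only the code 11 for task_impossible is confirmed in this thread; the other enum values are recalled from AitW's action_type.py and should be checked against the linked file. The helper `decode_action_type` and the shape of the prediction dict are hypothetical, for illustration only.

```python
import enum


class ActionType(enum.IntEnum):
    # Values assumed to match AitW's action_type.py; only 11 is
    # confirmed in this thread. Verify the rest against the linked file.
    TYPE = 3
    DUAL_POINT = 4
    PRESS_BACK = 5
    PRESS_HOME = 6
    PRESS_ENTER = 7
    STATUS_TASK_COMPLETE = 10
    STATUS_TASK_IMPOSSIBLE = 11


def decode_action_type(prediction: dict) -> str:
    """Hypothetical helper: map a predicted 'action_type' code to its name."""
    # The model emits the code as a string (e.g. '11'), so convert first.
    code = int(prediction["action_type"])
    return ActionType(code).name.lower()


# The output quoted above decodes to the task-impossible status.
print(decode_action_type({"action_type": "11"}))  # status_task_impossible
```

An unknown code raises ValueError here, which makes predictions outside the action space easy to spot during evaluation.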

Did the description in the paper cause the misunderstanding? We will make the paper clearer :)

Jiayi-Pan commented 11 months ago

That's great, and thank you for the clarification. It would be great if you could refine the statement on page 5: "We consider six action types: dual-point gesture, type, go_back, go_home, enter, and status_complete." :)