X-PLUG / MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
https://arxiv.org/abs/2406.01014
MIT License
2.3k stars 193 forks source link

Cosidering use previous action and explore phase to improve accuracy #12

Closed zhiyuan8 closed 4 months ago

zhiyuan8 commented 4 months ago

As shown in AppAgent

https://github.com/mnotgod96/AppAgent/blob/main/scripts/task_executor.py#L204-L206

they use last_act in their prompt, which makes it easier to detect if there is a dead loop and we need to find another solution.

Also, they use explore phase to improve their task execution phase.

Could those improve accuracy?

junyangwang0410 commented 4 months ago

As shown in AppAgent

https://github.com/mnotgod96/AppAgent/blob/main/scripts/task_executor.py#L204-L206

they use last_act in their prompt, which makes it easier to detect if there is a dead loop and we need to find another solution.

Also, they use explore phase to improve their task execution phase.

Could those improve accuracy?

Thanks for your suggestion. We have preserved the operation history, which is provided to the Mobile-Agent in each round. The operation history includes every action generated by the Mobile-Agent and the corresponding screenshots. It is stored in the format of a dialogue and has been simplified to enable the Mobile-Agent to recognize dead loops or erroneous operations. In our paper, we have showcased the performance of the Mobile-Agent when confronted with invalid operations.