mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
https://appagent-official.github.io/
MIT License

Comparison with AutoDroid (a similar tool)? #8

Open yuanchun-li opened 8 months ago

yuanchun-li commented 8 months ago

Hello, good work!

I'm one of the authors of AutoDroid, an LLM-based Android task automation approach released several months before AppAgent. We did not advertise our work, so it didn't get as much attention as yours. However, I'm glad to see that people are excited about this direction. Thanks for your contribution to the community :)

I noticed that the exploration-augmented method of AppAgent is quite similar to AutoDroid's, although AutoDroid is purely text-based (using Vicuna, GPT-3.5, and GPT-4) while AppAgent is based on GPT-4V. I'm curious about the benefits and challenges of using multimodal models for such UI automation tasks. Have you tested this? Or could you briefly comment on it?
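
For readers unfamiliar with the distinction, here is a minimal, hypothetical sketch of the two prompting styles; it is not taken from either codebase, and the prompts, helper parameters, and model IDs are illustrative assumptions only:

```python
# Hypothetical sketch contrasting text-only vs. multimodal UI-automation prompting.
# Prompts, model names, and parameters are illustrative; neither function reproduces
# the actual AutoDroid or AppAgent implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def text_only_step(ui_elements: str, task: str) -> str:
    """AutoDroid-style: the current UI is serialized to text for a text-only LLM."""
    prompt = (
        f"Task: {task}\n"
        f"Current UI elements (one per line):\n{ui_elements}\n"
        "Which element should be acted on next, and with what action?"
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def multimodal_step(screenshot_b64: str, task: str) -> str:
    """AppAgent-style: a labeled screenshot is sent to a multimodal model (GPT-4V)."""
    resp = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Task: {task}\nDecide the next action on this screen."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content
```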

Since I didn't find any comparison with AutoDroid in your paper or GitHub repo, maybe we can have an open discussion here.

AutoDroid website: https://autodroid-sys.github.io/
AutoDroid paper: https://arxiv.org/abs/2308.15272
AutoDroid code: https://github.com/MobileLLM/AutoDroid

Best, Yuanchun

yuanchun-li commented 8 months ago

BTW, comparisons with several important prior works (AitW, Auto-UI, CogAgent, etc.) are also missing. They address the same task automation problem and are also based on multimodal models. I think it would be good to give them credit.