mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
https://appagent-official.github.io/
MIT License
4.84k stars 511 forks

Questions about the content of the task and the amount of documentation #40

Closed kig1929 closed 6 months ago

kig1929 commented 7 months ago

I am interested in your research and thank you for sharing your project! I have some questions about AppAgent.

  1. Could you share the list of 50 tasks across 10 apps from the paper? I think it would help others understand the details of this study if the entire process were disclosed, not only the demo.
  2. How many documents were generated for each task during the exploration phase? I want to know how much exploration is needed before AppAgent can be used meaningfully.

icoz69 commented 7 months ago

hi, thanks for your interest. we are going to release a more comprehensive benchmark with more apps and tasks in the future. please look forward to it.

kig1929 commented 7 months ago

Could you give me a rough estimate of when that might be?

icoz69 commented 7 months ago

hi, we have added the tasks we evaluated here: https://github.com/mnotgod96/AppAgent/blob/main/assets/testset.md

kig1929 commented 6 months ago

Thanks for sharing :)