X-PLUG / MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
https://arxiv.org/abs/2406.01014
MIT License
2.3k stars 193 forks source link
agent android app automation copilot gpt4v gui harmony ios mllm mobile mobile-agents multimodal multimodal-agent multimodal-large-language-models

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Open in Spaces Demo ModelScope

MobileAgent | Trendshift

English | 简体中文

📺Demo

Mobile-Agent-v2

https://github.com/X-PLUG/MobileAgent/assets/127390760/d907795d-b5b9-48bf-b1db-70cf3f45d155

Mobile-Agent

https://github.com/X-PLUG/MobileAgent/assets/127390760/26c48fb0-67ed-4df6-97b2-aa0c18386d31

📢News

📱Version

⭐Star History

Star History Chart

📑Citation

If you find Mobile-Agent useful for your research and applications, please cite using this BibTeX:

@article{wang2024mobile2,
  title={Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration},
  author={Wang, Junyang and Xu, Haiyang and Jia, Haitao and Zhang, Xi and Yan, Ming and Shen, Weizhou and Zhang, Ji and Huang, Fei and Sang, Jitao},
  journal={arXiv preprint arXiv:2406.01014},
  year={2024}
}

@article{wang2024mobile,
  title={Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception},
  author={Wang, Junyang and Xu, Haiyang and Ye, Jiabo and Yan, Ming and Shen, Weizhou and Zhang, Ji and Huang, Fei and Sang, Jitao},
  journal={arXiv preprint arXiv:2401.16158},
  year={2024}
}

📦Related Projects