mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
https://appagent-official.github.io/
MIT License
4.97k stars 538 forks source link

ask_gpt4v in model.py needs to be refactored #9

Open truebit opened 9 months ago

truebit commented 9 months ago

We should merge the JSON handling code (the content assembly code) from transaction layer to the base model layer, and ask_gpt4v method signature would need to be changed to text, snapshots, conversation_id.

This would make the GPT-4V API endpoint more flexible.