mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
https://appagent-official.github.io/
MIT License
4.94k stars 536 forks

This operation seems expensive? #30

Open chengsluo opened 9 months ago

chengsluo commented 9 months ago

When I test the task of calling someone, a lot of tokens are consumed because the Base64 encoding of the picture needs to be sent. The cost of personal experimentation this way is too high...

chengsluo commented 9 months ago

"You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors." I logged the input to the ask_gpt4v function, and the Base64 string in it is very large. Usage exceeded my $15 limit, so the test run failed.

Is it really that expensive, or is there something wrong with my operation?
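For a rough sanity check, here is a minimal sketch of two estimates. `base64_length` is plain arithmetic (Base64 emits 4 characters for every 3 input bytes). `estimate_image_tokens` is an assumption: it follows the tile-based rule OpenAI has documented for GPT-4 with vision (85 tokens for low detail, otherwise 85 plus 170 per 512 px tile after downscaling), which may not match current pricing or the settings AppAgent actually sends. Both function names are made up for illustration and are not part of AppAgent. If that rule applies, the image cost depends on the picture dimensions and the detail level rather than on the length of the Base64 string itself.

```python
import base64
import math


def base64_length(num_bytes: int) -> int:
    """Character count of the padded Base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate image tokens under the tile-based rule (an assumption, see above)."""
    if detail == "low":
        # Low detail is a flat cost regardless of image size.
        return 85
    # Step 1: downscale to fit inside a 2048 x 2048 square (never upscale).
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: downscale so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count the 512 px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


if __name__ == "__main__":
    payload = base64.b64encode(b"\x00" * 1000)
    print(len(payload), base64_length(1000))   # both values should agree
    print(estimate_image_tokens(1024, 1024))   # high detail, square image
    print(estimate_image_tokens(1024, 1024, detail="low"))
```

Under these assumptions, shrinking the picture before encoding it, or requesting low detail, would be the main levers for reducing the per-request cost.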

icoz69 commented 7 months ago

Hi, we have added qwen-vl-max (通义千问-VL, Tongyi Qianwen VL) as an alternative multimodal model. The model is currently free to use.