BAAI-Agents / Cradle

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
https://baai-agents.github.io/Cradle/
MIT License
1.89k stars, 165 forks

Other LLM provider seems not correct #84

Open Skylarking opened 2 months ago

Skylarking commented 2 months ago

Because the OpenAI API is banned in China, I use the Qwen API provided by Alibaba as my LLM provider. But some errors occur on the command line. Is the provider not good enough to reason correctly? [image]

WeihaoTan commented 2 months ago

Thanks for reaching out. It seems that the action is not generated correctly. The action should be ["open_feishu_assistant_app()"] instead of ["action(open_feishu_assistant_app())"]. Maybe you need to modify the prompt to emphasize this. It seems that Qwen cannot strictly follow the instructions and output the answer in the correct format, which is a common issue for small models.
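
As a stopgap alongside the prompt change, the malformed output could also be cleaned up before it is executed. The following is a minimal sketch, not Cradle code: `unwrap_action` and `normalize_actions` are hypothetical helpers, and the sketch assumes the only defect is a single `action(...)` wrapper around an otherwise valid skill call.

```python
import re

# Hypothetical helpers, not part of Cradle. Assumption: the model wraps an
# otherwise valid skill call in a single spurious "action(...)".
_WRAPPER = re.compile(r"^\s*action\((.*)\)\s*$", re.DOTALL)


def unwrap_action(raw: str) -> str:
    """Return the inner call if raw looks like action(<call>), else raw stripped."""
    match = _WRAPPER.match(raw)
    # Only unwrap when the inner text itself ends like a call, e.g. "foo()".
    if match and match.group(1).strip().endswith(")"):
        return match.group(1).strip()
    return raw.strip()


def normalize_actions(actions: list[str]) -> list[str]:
    """Apply unwrap_action to every entry of the model's action list."""
    return [unwrap_action(a) for a in actions]


print(normalize_actions(["action(open_feishu_assistant_app())"]))
```

Well-formed entries pass through unchanged, so the helper is safe to apply to every response. It does not replace fixing the prompt, since a model that ignores the format may break it in other ways too.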

BruceWayneZero commented 2 months ago

I am also trying to use the Qwen API. I simply changed the api_key and base_url, but it doesn't seem to work. Could you tell me what you did to adapt to the Qwen API?

KJinze commented 2 months ago

I am testing Stardew Valley using the 'qwen-vl-max' model, but after the character comes out of the house, there are no further actions, and the task of clearing the farm has not been completed. It seems that the model's intelligence might be insufficient? (By the way, during the test, 196 queries consumed 170,000 tokens, so the cost could also be a significant issue.)

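The issues I have run into so far, for reference only:

1. Change api_key and base_url:

    self.client = OpenAI(
        api_key="sk-",  # If you have not configured the environment variable, replace this with your API key
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # base_url of the DashScope service
    )

2. Change the argument passed to the embedding interface to a list of strings:

    texts_batch = [encoding.decode(t) for t in tokens[i: i + self.chunk_size]]
    response = self.embed_with_retry(
        input=texts_batch,
        **self._emb_invocation_params,
    )

   The Qwen embedding interface currently supports only two kinds of input, a string or a list of strings, so the input has to be converted to a list of strings.

3. Edit openai_config.json:

    "emb_model": "text-embedding-v3",
    "comp_model": "qwen-vl-max",

   Image recognition is used, so the qwen-vl-max model is required.

4. Delete res/stardew/skills/skill_lib.json. The Qwen embedding interface returns vectors of length 1024 by default, which conflicts with the previously cached embeddings.

5. Set the window name in the .env file:

    IDE_NAME = "Cradle"

   I open the Cradle project in PyCharm (without opening any file) and the window name is "Cradle", so I set it to "Cradle". If it is set incorrectly, the following error is raised (I don't know what this window lookup is for; I changed it only to make the error go away):

    ide_window = io_env.get_windows_by_name(ide_name)[0]
    IndexError: list index out of range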
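
The cache conflict in step 4 comes down to comparing vectors of different lengths. Below is a minimal sketch of how stale entries could be detected instead of deleting the file blindly. It is illustrative only: the cache layout (a mapping from skill name to embedding list) and the function name are assumptions, not the actual structure of skill_lib.json, and 1536 is a stand-in for the old dimension, since the thread does not say which model produced the cache.

```python
# Hypothetical sketch: the real skill_lib.json layout may differ.
def find_dimension_mismatches(
    cached: dict[str, list[float]], expected_dim: int
) -> list[str]:
    """Return names of cached skills whose embedding length differs from expected_dim."""
    return [name for name, vec in cached.items() if len(vec) != expected_dim]


# Assumed example data: one entry from an older 1536-dimensional cache,
# one entry with the 1024 dimensions mentioned in the thread.
cache = {
    "skill_a": [0.0] * 1536,
    "skill_b": [0.0] * 1024,
}

stale = find_dimension_mismatches(cache, expected_dim=1024)
print(stale)
```

If any entry is reported, the cached embeddings need to be regenerated with the new embedding model, which is what deleting the file achieves.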

Skylarking commented 2 months ago

Because the Qwen API is similar to the OpenAI API, I just copied cradle/provider/llm/openai.py as qwen.py and modified some of the code, then added qwen.py to llm_factory.py. Note that you need to read the Qwen docs.
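
The general shape of that approach, a second provider class registered next to the existing one, might look like the sketch below. Everything here is hypothetical: the class names, `ProviderConfig`, and `create_provider` are invented for illustration and do not reflect the actual contents of openai.py or llm_factory.py. Only the base URL and model names are taken from this thread.

```python
from dataclasses import dataclass


@dataclass
class ProviderConfig:
    """Hypothetical settings container, not Cradle's config format."""
    base_url: str
    comp_model: str
    emb_model: str


class OpenAICompatibleProvider:
    """Stand-in for a provider that talks to an OpenAI-style endpoint."""

    def __init__(self, config: ProviderConfig) -> None:
        self.config = config


class QwenProvider(OpenAICompatibleProvider):
    """Copy of the base provider; Qwen-specific changes would go here."""


# Hypothetical registry playing the role of the factory module.
_PROVIDERS = {
    "openai": OpenAICompatibleProvider,
    "qwen": QwenProvider,
}


def create_provider(name: str, config: ProviderConfig) -> OpenAICompatibleProvider:
    """Look up a provider class by name and instantiate it."""
    try:
        provider_cls = _PROVIDERS[name]
    except KeyError:
        raise ValueError(f"Unknown LLM provider: {name!r}") from None
    return provider_cls(config)


qwen_config = ProviderConfig(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    comp_model="qwen-vl-max",
    emb_model="text-embedding-v3",
)
provider = create_provider("qwen", qwen_config)
print(type(provider).__name__, provider.config.comp_model)
```

Keeping the Qwen variant in its own class leaves the OpenAI path untouched, so provider-specific differences such as the embedding input format stay isolated in one place.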

Skylarking commented 2 months ago

If I have time, I will share my code on GitHub. Then you can check it and modify your own accordingly.