[Open] Sunmori opened this issue 4 months ago
This happened with me too. I was using OpenRouter with Meta Llama 7B Instruct.
This is a known issue with some OpenRouter models. It has nothing to do with AnythingLLM; it is a model-side issue where quantization is most likely causing the model to give garbage responses and finish replies early.
@Sunmori To confirm: are you using the FastChat OpenAI API server shown in Qwen's documentation, or are you running Qwen on another platform that has Qwen support? The issue may be that their API spec does not produce the same outputs for chat completions, which breaks streaming. If that is the case, we can create an LLM provider specifically for Qwen.
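To illustrate the streaming failure mode described above, here is a minimal sketch of how an OpenAI-style client typically accumulates a streamed chat completion. The chunk shapes are simplified examples, not AnythingLLM's actual parser: the point is that a provider which sets `finish_reason` early, or emits a differently shaped payload, makes the client cut the reply off after the first word, which matches the reported symptom.

```python
import json

def read_stream(chunks):
    """Accumulate text from OpenAI-style streaming chunks.

    Clients commonly treat a non-null finish_reason as end-of-reply.
    A provider that sets finish_reason on an early chunk (or deviates
    from the delta shape) causes the reply to stop prematurely.
    """
    text = []
    for raw in chunks:
        chunk = json.loads(raw)
        choice = chunk["choices"][0]
        delta = choice.get("delta", {})
        if delta.get("content"):
            text.append(delta["content"])
        if choice.get("finish_reason") is not None:
            break  # the client considers the reply finished here
    return "".join(text)

# A spec-compliant stream: content deltas, then finish_reason last.
ok = [
    '{"choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}',
    '{"choices":[{"delta":{"content":" world"},"finish_reason":null}]}',
    '{"choices":[{"delta":{},"finish_reason":"stop"}]}',
]

# A non-compliant stream that sets finish_reason on the first chunk:
# the client stops after a single word.
bad = [
    '{"choices":[{"delta":{"content":"Hello"},"finish_reason":"stop"}]}',
    '{"choices":[{"delta":{"content":" world"},"finish_reason":null}]}',
]
```

Running `read_stream(ok)` yields the full `"Hello world"`, while `read_stream(bad)` stops at `"Hello"`.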
I'm using the OpenAI-compatible URLs and APIs given in the official Qwen documentation. I also had the same problem with another LLM provider, both OpenAI API compatible. I'm wondering if there's a bug in the Generic OpenAI part of AnythingLLM, because I've set up Generic OpenAI configurations in other software and they work fine. Of course, it would be great if an LLM provider could be created specifically for Qwen, which is a very well-known LLM provider in China.
Qwen official documentation: https://help.aliyun.com/zh/dashscope/developer-reference/compatibility-of-openai-with-dashscope/?spm=a2c4g.11186623.0.0.1d09f400wOtLdc
Zhipu AI official documentation: https://open.bigmodel.cn/dev/api#thirdparty_frame
Screenshots of other software:
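For reference, a request to DashScope's OpenAI-compatible mode looks like any other OpenAI chat completion call. The sketch below builds (but does not send) such a request with only the standard library; the base URL follows the linked Aliyun docs, and the model name `qwen-turbo` is an illustrative assumption. Comparing a misbehaving client's actual request against this shape helps isolate whether the URL, auth header, or body is at fault.

```python
import json
from urllib import request

# OpenAI-compatible base URL from DashScope's documentation.
BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"

def build_chat_request(api_key, messages, model="qwen-turbo", stream=True):
    """Build an OpenAI-style chat completion request for DashScope.

    Returns a urllib Request object; sending it is left to the caller.
    The body shape (model, messages, stream) is what any generic
    OpenAI-compatible client should produce.
    """
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```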
What is the LLM provider you are using behind the generic OpenAI connector? Usually the issue with the generic connector is that some providers use different stop tokens, and that halts replies early.
If there is a way for us to sign up and access whatever provider you are using, we can easily add it as a connector and eliminate this issue.
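The stop-token issue mentioned above can be sketched as follows. This is a simplified illustration, not any provider's actual implementation: an OpenAI-compatible server truncates the completion at the first stop sequence it finds, so if the server's stop tokens don't match the model's chat template, a template token leaking into the output ends the reply early.

```python
def apply_stop(text, stops):
    """Truncate a completion at the first stop sequence found,
    as an OpenAI-compatible server does before returning the reply."""
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]

# With matching templates the stop token only appears at the end.
# With a mismatch, a ChatML-style token like <|im_end|> can leak
# into the middle of the text and the reply is cut short there.
leaked = "Hello<|im_end|> world"
```

Here `apply_stop(leaked, ["<|im_end|>"])` returns just `"Hello"`, while text without a leaked token passes through untouched.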
I'm using the Qwen API released by Aliyun, but it seems to be open only for use in China at the moment. I can give you my API key if you want to add the connector.
How are you running AnythingLLM?
AnythingLLM desktop app
What happened?
I chose Generic OpenAI in the LLM Provider settings and filled in Qwen's API details. When chatting, the dialog box stops after a single word appears. I looked at Qwen's documentation and it shows compatibility with the OpenAI SDK. Is anyone else able to use the Qwen API properly? I'm begging for your help.
Are there known steps to reproduce?
No response