[BUG]: max_token does not take effect when requesting /api/v1/openai/chat/completions on AnythingLLM 1.6.0 desktop with a local model (xinference) #2023
How are you running AnythingLLM?
AnythingLLM desktop app
What happened?
When I request /api/v1/openai/chat/completions on the AnythingLLM 1.6.0 desktop version, the max_token parameter does not take effect. The connected model is a local model served via xinference.
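For reference, this is a minimal sketch of the request body being sent (the workspace slug and token limit are placeholders, not values from my setup). Note that the OpenAI-compatible parameter is spelled `max_tokens` with an "s"; an unrecognized key such as `max_token` would typically be ignored silently by OpenAI-compatible servers, which may be worth checking here.

```python
import json

# OpenAI-compatible chat completion payload for
# POST /api/v1/openai/chat/completions.
payload = {
    "model": "my-workspace",  # hypothetical workspace slug
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 16,         # expected to cap the reply length, but has no effect
}

body = json.dumps(payload)
print(body)
```

I would expect the response to be truncated to roughly 16 tokens, but the model replies at full length regardless of this value.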
Are there known steps to reproduce?
This is xinference: (screenshot). This is AnythingLLM desktop: (screenshot).