stanford-oval / WikiChat

WikiChat is an improved RAG system that reduces hallucination in large language models by retrieving data from a corpus.
https://wikichat.genie.stanford.edu
Apache License 2.0

422 Unprocessable Entity #32

Open · zhangapeng opened this issue 2 weeks ago

zhangapeng commented 2 weeks ago

I get a “422 Unprocessable Entity” error when calling a local LLM service, and I don't know what's causing it. [screenshot of the error attached]

s-jse commented 2 weeks ago

Hi,

Can you please let us know what server and model you are using (e.g., LLaMA-3 on text-generation-inference), and what command you are using to run WikiChat?

zhangapeng commented 2 weeks ago

> Hi,
>
> Can you please let us know what server and model you are using (e.g., LLaMA-3 on text-generation-inference), and what command you are using to run WikiChat?

I use the API deployment code from the chatglm3 repository, which is compatible with the OpenAI API, and I run WikiChat with "inv demo --engine local". The error message on the terminal is as follows: [screenshot of the terminal error attached]

In addition, I was able to call the local LLM service successfully using the litellm library on its own.
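For reference, the standalone call looked roughly like this (a minimal sketch; the port, model name, and API key are placeholders for my local deployment):

```python
import litellm

# Direct call to the chatglm3 OpenAI-compatible server through litellm.
# The "openai/" prefix tells litellm to treat the endpoint as OpenAI-compatible.
# Port, model name, and api_key are placeholders for the local setup.
response = litellm.completion(
    model="openai/chatglm3-6b",
    api_base="http://localhost:8000/v1",
    api_key="not-needed",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(response.choices[0].message.content)
```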

s-jse commented 2 weeks ago

One thing to check is which port you are serving chatglm3 from. By default, WikiChat expects local models to be served from port 5002. See https://github.com/stanford-oval/WikiChat/blob/main/llm_config.yaml#L99-L103 on how to change that if needed.

If that doesn't help, you can enable LiteLLM's verbose logging (https://github.com/stanford-oval/WikiChat/blob/main/llm_config.yaml#L17) and paste the full log here, to help us with troubleshooting.
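A quick request sent directly to that port can also help, since the body of a 422 response usually says which field failed validation. Something like this minimal sketch (the model name here is a placeholder):

```python
import requests

# Minimal request against the port WikiChat expects local models on (5002 by default).
# The model name is a placeholder; the body of a 422 response usually explains
# which field failed validation on the server side.
resp = requests.post(
    "http://localhost:5002/v1/chat/completions",
    json={
        "model": "local",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
print(resp.status_code)
print(resp.text)
```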

zhangapeng commented 1 week ago

> One thing to check is which port you are serving chatglm3 from. By default, WikiChat expects local models to be served from port 5002. See https://github.com/stanford-oval/WikiChat/blob/main/llm_config.yaml#L99-L103 on how to change that if needed.
>
> If that doesn't help, you can enable LiteLLM's verbose logging (https://github.com/stanford-oval/WikiChat/blob/main/llm_config.yaml#L17) and paste the full log here, to help us with troubleshooting.

I use vLLM to deploy a local LLM. How should I modify the "local: huggingface/local" field in the llm_config.yaml file? I tried changing it to the name I set when deploying with vLLM, but it reported an error that the model does not exist. If I don't modify it, huggingface reports an error. [screenshot of the error attached]
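For reference, this is roughly how I check which model name the vLLM server actually exposes (a minimal sketch; the port is a placeholder for my deployment):

```python
import requests

# The vLLM OpenAI-compatible server lists the models it serves under /v1/models.
# The port is a placeholder; the "id" values printed here are the exact strings
# that the "model" field of a request has to match.
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
for model in resp.json()["data"]:
    print(model["id"])
```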

s-jse commented 2 days ago

I just tested, and it does not seem to work with vLLM. I will need to look into it more closely. In the meantime, you can use https://github.com/huggingface/text-generation-inference/, which I just tested and works with this code base.