Closed arpaiva closed 2 months ago
I had to deal with this error a little while ago. It had to do with detecting “chat” vs “instruct” in the model name. I will follow up with a fix
This also should be fixed by the backend refactor if you try that branch
+1
Hi. I am not sure what the root cause is. Sometimes I get the error and sometimes the same script runs correctly on different datasets. I have also tried two instruct models, "meta-llama/Meta-Llama-3-8B-Instruct" and "TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ", and hit the issue with Llama 3 but not Mixtral.
@isaacbmiller is there any workaround? Thanks. @arpaiva did the backend refactor branch fix the issue?
@isaacbmiller I tried the "backend-refactor" branch and the issue persists.
Traceback (most recent call last):
  File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dsp/modules/hf_client.py", line 231, in _generate
    completions = json_response["choices"]
KeyError: 'choices'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/./src/rag/ReAct.py", line 712, in <module>
main(args)
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/./src/rag/ReAct.py", line 633, in main
result = react_module(question=question)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dspy/primitives/program.py", line 26, in __call__
return self.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dspy/predict/react.py", line 116, in forward
output = self.react[hop](**args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dspy/predict/predict.py", line 69, in __call__
return self.forward(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dspy/predict/predict.py", line 132, in forward
x, C = dsp.generate(template, **config)(x, stage=self.stage)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dsp/primitives/predict.py", line 120, in do_generate
completions: list[dict[str, Any]] = generator(prompt, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dsp/modules/hf.py", line 190, in __call__
response = self.request(prompt, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dsp/modules/lm.py", line 26, in request
return self.basic_request(prompt, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dsp/modules/hf.py", line 147, in basic_request
response = self._generate(prompt, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch/scratch/rmhijpo/ctgov_rag/.venv/lib/python3.11/site-packages/dsp/modules/hf_client.py", line 240, in _generate
raise Exception("Received invalid JSON response from server")
Exception: Received invalid JSON response from server
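The traceback shows what is going on under the hood: the client indexes `json_response["choices"]` unconditionally, so when the server returns an error payload instead of completions, the `KeyError` is swallowed and re-raised as a generic "invalid JSON response", hiding the server's actual message. A minimal sketch of more defensive parsing (illustrative only, with a hypothetical `extract_completions` helper; this is not the actual `dsp/modules/hf_client.py` code):

```python
import json

def extract_completions(raw: str) -> list:
    """Parse a vLLM/OpenAI-style JSON response, surfacing server errors
    instead of a bare KeyError on a missing "choices" key."""
    payload = json.loads(raw)
    if "choices" in payload:
        return payload["choices"]
    # vLLM error payloads look like {"object": "error", "message": "...", ...}
    message = payload.get("message", "unknown server error")
    raise RuntimeError(f"Server returned an error instead of completions: {message}")

# A successful response yields the choices list:
ok = '{"choices": [{"text": "hello"}]}'
print(extract_completions(ok))  # [{'text': 'hello'}]
```

With this shape, the context-length error shown later in this thread would surface directly instead of being masked by the generic exception.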
@isaacbmiller is this issue related to https://github.com/stanfordnlp/dspy/issues/1002 ?
Update: I tried implementing PR https://github.com/stanfordnlp/dspy/pull/1012 but the issue persists.
@isaacbmiller - Thank you so much for the comments. @JPonsa - On my recent tests, this seems to have been fixed by commit 110a282c from a few days ago as long as I set vLLM to use the OpenAI entrypoint.
@arpaiva I tried dspy 2.4.12 (updated using poetry) with the OpenAI entrypoint, but I got another error: https://github.com/stanfordnlp/dspy/issues/1276
Given that you had to replace HFClientVLLM with the OpenAI endpoint, is this fully resolved or a workaround? @isaacbmiller should we reopen this issue, or should I open a new one?
I think that you should still be able to use HFClientVLLM, so I will reopen the issue and investigate tomorrow. Sorry for the delay in looking into this.
Sorry, I think it is kind of a user error. I am running this on a server, and some messages get written to different files, which makes them harder to track.
It seems there could be some sort of parsing error in ReAct: it fails to leave the loop and fills the context window.
Function Response: The study population in clinical trial NCT00001109 is adult patients with HIV infection. # The answer is ready, so I should use the Finish action. # Action: Finish[The study population in clinical trial NCT00001109 is adult patients with HIV infection.] # The task is now complete. # Thought: There is no more work to do. # Action: Finish[The task is now complete.] # The task is now complete. # Thought: There is no more work to do. # Action: Finish[The task is now complete.] # The task is now complete. [... the same Thought/Action/Finish block repeats until the context window fills ...]
Failed to parse JSON response: {"object":"error","message":"This model's maximum context length is 8192 tokens. However, you requested 9527 tokens (8527 in the messages, 1000 in the completion). Please reduce the length of the messages or completion.","type":"BadRequestError","param":null,"code":400}
@JPonsa - I think that you misunderstood me. I did not have to change HFClientVLLM or anything in DSPy. vLLM can serve the LLM in several ways, which they call entrypoints. I was simply noting that vLLM should be set to serve the LLM with an OpenAI-style API interface, which one gets via the vLLM OpenAI entrypoint.
Got it! I was already using the OpenAI entrypoint:
MODEL=meta-llama/Meta-Llama-3-8B-Instruct
MODEL_NAME=llama3_8b
PORT=8045
pip install poetry
poetry run python -m vllm.entrypoints.openai.api_server --model $MODEL --trust-remote-code --port $PORT --dtype half --enforce-eager \
--gpu-memory-utilization 0.80 &
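As a sanity check that the OpenAI entrypoint is actually up, one can POST a small completion request to `/v1/completions` on the port above. This sketch only builds the URL and payload offline (port 8045 and the model name come from the commands above; send the request with `curl` or `urllib.request` yourself); a healthy server answers with a JSON body containing a "choices" key, which is exactly what the failing client code expects:

```python
import json

# Request shape for vLLM's OpenAI-compatible /v1/completions endpoint,
# matching the server started with the commands above.
PORT = 8045
url = f"http://localhost:{PORT}/v1/completions"
payload = {
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "prompt": "Say hello.",
    "max_tokens": 16,
}
body = json.dumps(payload)
print(url)  # http://localhost:8045/v1/completions
```

If the response instead has `"object": "error"` (as in the context-length failure above), the problem is on the server side, not in DSPy's client.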
@isaacbmiller, please feel free to close this issue. Mine was very likely a user error, and arpaiva's is solved. Sorry for the inconvenience.
There are a lot of issues open wrt this but no crisp answer of what solves it.
I want to try DSPy with a local LLM served via vLLM. I followed the instructions from https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientVLLM The model was downloaded previously, stored in a local folder, and served with:
but running
yields
I also tried calling `model` directly (i.e., without using `._generate`) and instantiating `dspy.HFClientVLLM` with `model_type='chat'`. All resulted in the same outcome. On the server side I got:
Lastly, I also tried using the OpenAI API entrypoint:
but that also triggered the same error.
This is running from a clone of the repo at the latest commit of the main branch, `55510ee`.