tom-doerr opened 1 month ago
Doesn't happen when I switch the model to
astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit
python3 -m vllm.entrypoints.openai.api_server --model astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit --quantization gptq --tensor-parallel-size 1 --port 38242 --gpu-memory-utilization 0.8 --dtype float16
Now I'm getting a BadRequestError again. Maybe the vLLM server just blocked me because I was sending so many bad requests earlier.
Creating basic bootstrap: 1/9
2%|▊ | 3/128 [00:00<00:04, 25.35it/s]
Creating basic bootstrap: 2/9
2%|▊ | 3/128 [00:00<00:01, 92.33it/s]
Creating basic bootstrap: 3/9
2%|▊ | 3/128 [00:00<00:01, 76.30it/s]
Creating basic bootstrap: 4/9
2%|▊ | 3/128 [00:00<00:01, 94.34it/s]
Creating basic bootstrap: 5/9
2%|▊ | 3/128 [00:00<00:01, 90.12it/s]
Creating basic bootstrap: 6/9
2%|▊ | 3/128 [00:00<00:01, 102.07it/s]
Creating basic bootstrap: 7/9
2%|▊ | 3/128 [00:00<00:01, 76.79it/s]
Creating basic bootstrap: 8/9
2%|▊ | 3/128 [00:00<00:01, 77.57it/s]
Creating basic bootstrap: 9/9
2%|▊ | 3/128 [00:00<00:01, 80.30it/s]
Failed to parse JSON response: {"object":"error","message":"[{'type': 'extra_forbidden', 'loc': ('body', 'do_sample'), 'msg': 'Extra inputs are not permitted', 'input': True, 'url': 'https://errors.pydantic.dev/2.5/v/extra_forbidden'}]","type":"BadRequestError","param":null,"code":400}
Traceback (most recent call last):
File "/home/tom/dspy/dsp/modules/hf_client.py", line 206, in _generate
completions = json_response["choices"]
KeyError: 'choices'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/tom/web_ai/./generate_emails.py", line 175, in <module>
compiled_with_assertions_mailer = teleprompter.compile(emailer, trainset=trainset, num_trials=100, max_bootstrapped_demos=3, max_labeled_demos=5, eval_kwargs=kwargs, requires_permission_to_run=False)
File "/home/tom/dspy/dspy/teleprompt/mipro_optimizer.py", line 461, in compile
instruction_candidates, _ = self._generate_first_N_candidates(
File "/home/tom/dspy/dspy/teleprompt/mipro_optimizer.py", line 249, in _generate_first_N_candidates
self.observations = self._observe_data(devset).replace("Observations:", "").replace("Summary:", "")
File "/home/tom/dspy/dspy/teleprompt/mipro_optimizer.py", line 177, in _observe_data
observation = dspy.Predict(DatasetDescriptor, n=1, temperature=1.0)(examples=(trainset[0:upper_lim].__repr__()))
File "/home/tom/dspy/dspy/predict/predict.py", line 61, in __call__
return self.forward(**kwargs)
File "/home/tom/dspy/dspy/predict/predict.py", line 103, in forward
x, C = dsp.generate(template, **config)(x, stage=self.stage)
File "/home/tom/dspy/dsp/primitives/predict.py", line 112, in do_generate
completions: list[dict[str, Any]] = generator(prompt, **kwargs)
File "/home/tom/dspy/dsp/modules/hf.py", line 190, in __call__
response = self.request(prompt, **kwargs)
File "/home/tom/dspy/dsp/modules/lm.py", line 26, in request
return self.basic_request(prompt, **kwargs)
File "/home/tom/dspy/dsp/modules/hf.py", line 147, in basic_request
response = self._generate(prompt, **kwargs)
File "/home/tom/dspy/dsp/modules/hf_client.py", line 215, in _generate
raise Exception("Received invalid JSON response from server")
Exception: Received invalid JSON response from server
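The root cause appears to be the 'extra_forbidden' message in the response above: the vLLM OpenAI-compatible server validates request bodies with pydantic and rejects unknown fields, and do_sample is a Hugging Face generation flag rather than an OpenAI API parameter. A minimal sketch of a workaround on the client side would be to drop the unsupported keys before building the request. The helper name and the exact set of forbidden keys here are assumptions for illustration, not taken from the DSPy or vLLM source:

```python
# Sketch: drop generation kwargs that an OpenAI-compatible endpoint
# may reject with 'extra_forbidden'. The key list below is an
# assumption, not the authoritative list from vLLM.
UNSUPPORTED_KEYS = {"do_sample", "num_return_sequences"}

def sanitize_vllm_kwargs(kwargs):
    """Return a copy of kwargs without fields the server forbids."""
    return {k: v for k, v in kwargs.items() if k not in UNSUPPORTED_KEYS}

payload = sanitize_vllm_kwargs(
    {"temperature": 1.0, "max_tokens": 128, "do_sample": True}
)
# payload keeps temperature and max_tokens but not do_sample
```

Filtering in one place like this avoids having to patch every call site in dsp/modules/hf_client.py.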
Getting the same "Not found" error again, but only after the MIPRO bootstrapping phase.
Looks related to https://github.com/stanfordnlp/dspy/issues/1002
Same issue, with a slightly different setup. I use the Qwen2-72B-Instruct-GPTQ-Int4 model. I followed the gsm8k demo and get the same error: completions = json_response["choices"] KeyError: 'choices'. But when I switch to the dspy.Predict('sentence -> sentiment') demo, it works fine.
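The KeyError itself comes from hf_client.py indexing json_response["choices"] without first checking whether the server returned an error body (as in the "object": "error" payload shown earlier in this thread). A defensive parse, sketched here with a hypothetical extract_completions helper that is not part of DSPy, would surface the server's actual error message instead of the generic "invalid JSON response" exception:

```python
# Sketch: parse a vLLM/OpenAI-style response defensively instead of
# assuming "choices" is always present. extract_completions is a
# hypothetical helper, not part of the DSPy codebase.
def extract_completions(json_response):
    if "choices" in json_response:
        return [choice["text"] for choice in json_response["choices"]]
    # Error bodies look like {"object": "error", "message": ..., ...}
    if json_response.get("object") == "error":
        raise RuntimeError(f"Server error: {json_response.get('message')}")
    raise RuntimeError(f"Unexpected response shape: {json_response}")
```

With this shape of check, the do_sample rejection above would show up as a readable "Server error: ..." message rather than a KeyError.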
The code I'm running:
I start the server with:
Output: