Closed. mneedham closed this issue 10 months ago.
Had the same issue. I think Ollama does not return prompt_eval_count if you rerun it on the exact same query (internal caching): https://github.com/ollama/ollama/issues/2068
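For anyone who wants to verify this, here is a quick reproduction sketch against a local Ollama server (the model name "llama2" and the default endpoint are assumptions about your setup): send the exact same prompt twice and check which token-count fields come back.

```python
# Reproduction sketch: assumes Ollama is running locally on the default port
# and that the "llama2" model has already been pulled.
import requests

def generate(prompt: str) -> dict:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2", "prompt": prompt, "stream": False},
    )
    resp.raise_for_status()
    return resp.json()

first = generate("What is the capital of France?")
second = generate("What is the capital of France?")  # identical prompt, may be served from cache

# When the response comes from Ollama's cache, prompt_eval_count can be missing.
print("first has prompt_eval_count: ", "prompt_eval_count" in first)
print("second has prompt_eval_count:", "prompt_eval_count" in second)
```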
Ah, interesting. So this is an issue in Ollama? Anyway, adding caching in DSPy will probably resolve it, since it will avoid going to Ollama twice. Anyone up for it? There's an example of caching for TGI and vLLM in the hf_clients.py file.
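For reference, a minimal sketch of what such caching could look like: memoize identical requests in-process so the second call never reaches Ollama. The class and method names here are illustrative, and the actual caching in hf_clients.py may be implemented differently.

```python
# Minimal in-process memoization sketch around an OllamaLocal-style client.
# Illustrative only: CachedOllama and the basic_request wiring are assumptions,
# not the existing DSPy implementation.
import json
from functools import lru_cache

class CachedOllama:
    def __init__(self, inner):
        self.inner = inner  # the underlying OllamaLocal-style client

    @lru_cache(maxsize=None)
    def _cached(self, prompt, kwargs_json):
        # kwargs_json is a JSON string so the keyword arguments are hashable.
        return self.inner.basic_request(prompt, **json.loads(kwargs_json))

    def request(self, prompt, **kwargs):
        return self._cached(prompt, json.dumps(kwargs, sort_keys=True))
```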
Facing the same issue. @okhat I can take this up; I sent #309 for the same class previously.
In short, if we do not have prompt_eval_count, we will use the previous / last available prompt_eval_count. Let me know if that sounds good. For this I don't think we need caching; we can get away with a private variable that stores the previous / last value, self._prev_prompt_eval_count, as sketched below.
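Roughly, a minimal sketch of that fallback inside OllamaLocal.basic_request (assuming the usage dict is built from response_json as in dsp/modules/ollama.py; exact variable names and placement may differ):

```python
# Sketch of the prompt_eval_count fallback inside OllamaLocal.basic_request.
# self._prev_prompt_eval_count would be initialized to 0 in __init__.
prompt_eval_count = response_json.get("prompt_eval_count")
if prompt_eval_count is None:
    # Ollama omits the field when serving a cached response,
    # so fall back to the last value we saw.
    prompt_eval_count = self._prev_prompt_eval_count
else:
    self._prev_prompt_eval_count = prompt_eval_count

request_info["usage"] = {
    "prompt_tokens": prompt_eval_count,
    "completion_tokens": tot_eval_tokens,
    "total_tokens": prompt_eval_count + tot_eval_tokens,
}
```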
@okhat sent a PR, please review.
@roolio @mneedham What are you guys testing ollama<>dspy for? Happy to discuss more.
@okhat A slightly off-topic question:
If Ollama is being served through an API, how are we going to optimize model weights? Are there any example notebooks that optimize model weights / parameters?
@INF800 Model weights are only updated for local models that support finetuning; see BootstrapFinetune.
@INF800 sorry I didn't see your reply until just now. I'm just playing around with stuff on my machine and I liked the approach that DSPy takes where you do stuff through code rather than constructing complicated prompts yourself!
I'm going through the same getting started notebook with Ollama and am running into a very similar error, this time with eval_count.
In [34]: compiled_rag_retrieval_score = evaluate_on_hotpotqa(compiled_rag, metric=gold_passages_retrieved)
0%| | 0/50 [00:00<?, ?it/s]
Average Metric: 0 / 1 (0.0): 0%| | 0/50 [00:24<?, ?it/s]
Average Metric: 0 / 1 (0.0): 2%|▏ | 1/50 [00:24<20:06, 24.63s/it]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[34], line 1
----> 1 compiled_rag_retrieval_score = evaluate_on_hotpotqa(compiled_rag, metric=gold_passages_retrieved)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/evaluate/evaluate.py:163, in Evaluate.__call__(self, program, metric, devset, num_threads, display_progress, display_table, display, return_all_scores, return_outputs)
160 reordered_devset, ncorrect, ntotal = self._execute_single_thread(
161 wrapped_program, devset, display_progress)
162 else:
--> 163 reordered_devset, ncorrect, ntotal = self._execute_multi_thread(
164 wrapped_program,
165 devset,
166 num_threads,
167 display_progress,
168 )
169 if return_outputs: # Handle the return_outputs logic
170 results = [(example, prediction, score)
171 for _, example, prediction, score in reordered_devset]
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/evaluate/evaluate.py:82, in Evaluate._execute_multi_thread(self, wrapped_program, devset, num_threads, display_progress)
78 pbar = tqdm.tqdm(total=len(devset), dynamic_ncols=True,
79 disable=not display_progress)
81 for future in as_completed(futures):
---> 82 example_idx, example, prediction, score = future.result()
83 reordered_devset.append(
84 (example_idx, example, prediction, score))
85 ncorrect += score
File /opt/conda/envs/py310/lib/python3.10/concurrent/futures/_base.py:451, in Future.result(self, timeout)
449 raise CancelledError()
450 elif self._state == FINISHED:
--> 451 return self.__get_result()
453 self._condition.wait(timeout)
455 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
File /opt/conda/envs/py310/lib/python3.10/concurrent/futures/_base.py:403, in Future.__get_result(self)
401 if self._exception:
402 try:
--> 403 raise self._exception
404 finally:
405 # Break a reference cycle with the exception in self._exception
406 self = None
File /opt/conda/envs/py310/lib/python3.10/concurrent/futures/thread.py:58, in _WorkItem.run(self)
55 return
57 try:
---> 58 result = self.fn(*self.args, **self.kwargs)
59 except BaseException as exc:
60 self.future.set_exception(exc)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/evaluate/evaluate.py:150, in Evaluate.__call__.<locals>.wrapped_program(example_idx, example)
148 current_error_count = self.error_count
149 if current_error_count >= self.max_errors:
--> 150 raise e
151 print(f"Error for example in dev set: \t\t {e}")
152 return example_idx, example, dict(), 0.0
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/evaluate/evaluate.py:132, in Evaluate.__call__.<locals>.wrapped_program(example_idx, example)
127 # print(threading.get_ident(), dsp.settings.stack_by_thread[threading.get_ident()])
128
129 # print(type(example), example)
131 try:
--> 132 prediction = program(**example.inputs())
133 score = metric(
134 example,
135 prediction,
136 ) # FIXME: TODO: What's the right order? Maybe force name-based kwargs!
138 # increment assert and suggest failures to program's attributes
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/primitives/program.py:26, in Module.__call__(self, *args, **kwargs)
25 def __call__(self, *args, **kwargs):
---> 26 return self.forward(*args, **kwargs)
Cell In[19], line 12, in RAG.forward(self, question)
10 def forward(self, question):
11 context = self.retrieve(question).passages
---> 12 prediction = self.generate_answer(context=context, question=question)
13 return dspy.Prediction(context=context, answer=prediction.answer)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/predict/predict.py:49, in Predict.__call__(self, **kwargs)
48 def __call__(self, **kwargs):
---> 49 return self.forward(**kwargs)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/predict/chain_of_thought.py:59, in ChainOfThought.forward(self, **kwargs)
57 signature = new_signature
58 # template = dsp.Template(self.signature.instructions, **new_signature)
---> 59 return super().forward(signature=signature, **kwargs)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dspy/predict/predict.py:91, in Predict.forward(self, **kwargs)
88 template = signature_to_template(signature)
90 if self.lm is None:
---> 91 x, C = dsp.generate(template, **config)(x, stage=self.stage)
92 else:
93 # Note: query_only=True means the instructions and examples are not included.
94 # I'm not really sure why we'd want to do that, but it's there.
95 with dsp.settings.context(lm=self.lm, query_only=True):
File /opt/conda/envs/py310/lib/python3.10/site-packages/dsp/primitives/predict.py:120, in _generate.<locals>.do_generate(example, stage, max_depth, original_example)
112 new_kwargs = {
113 **kwargs,
114 max_tokens_key: max_tokens,
115 "n": 1,
116 "temperature": 0.0,
117 }
119 assert max_depth > 0
--> 120 return generate(template, **new_kwargs)(
121 completion,
122 stage=stage,
123 max_depth=max_depth - 1,
124 original_example=original_example,
125 )
127 completions = Completions(completions, template=template)
128 example = example.copy(completions=completions)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dsp/primitives/predict.py:77, in _generate.<locals>.do_generate(example, stage, max_depth, original_example)
75 # Generate and extract the fields.
76 prompt = template(example)
---> 77 completions: list[dict[str, Any]] = generator(prompt, **kwargs)
78 completions: list[Example] = [template.extract(example, p) for p in completions]
80 # Find the completions that are most complete.
File /opt/conda/envs/py310/lib/python3.10/site-packages/dsp/modules/ollama.py:171, in OllamaLocal.__call__(self, prompt, only_completed, return_sorted, **kwargs)
168 assert only_completed, "for now"
169 assert return_sorted is False, "for now"
--> 171 response = self.request(prompt, **kwargs)
173 choices = response["choices"]
175 completed_choices = [c for c in choices if c["finish_reason"] != "length"]
File /opt/conda/envs/py310/lib/python3.10/site-packages/dsp/modules/ollama.py:145, in OllamaLocal.request(self, prompt, **kwargs)
142 if "model_type" in kwargs:
143 del kwargs["model_type"]
--> 145 return self.basic_request(prompt, **kwargs)
File /opt/conda/envs/py310/lib/python3.10/site-packages/dsp/modules/ollama.py:121, in OllamaLocal.basic_request(self, prompt, **kwargs)
106 text = (
107 response_json.get("message").get("content")
108 if self.model_type == "chat"
109 else response_json.get("response")
110 )
111 request_info["choices"].append(
112 {
113 "index": i,
(...)
119 },
120 )
--> 121 tot_eval_tokens += response_json.get("eval_count")
122 request_info["additional_kwargs"] = {k: v for k, v in response_json.items() if k not in ["response"]}
124 request_info["usage"] = {
125 "prompt_tokens": response_json.get("prompt_eval_count", self._prev_prompt_eval_count),
126 "completion_tokens": tot_eval_tokens,
127 "total_tokens": response_json.get("prompt_eval_count", self._prev_prompt_eval_count) + tot_eval_tokens,
128 }
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
Seems like storing an additional private variable self._prev_eval_count, similar to self._prev_prompt_eval_count, could fix this.
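A minimal sketch of that, mirroring the prompt_eval_count handling (self._prev_eval_count is just the proposed attribute name, not existing code; it would be initialized to 0 in __init__):

```python
# Sketch of an eval_count fallback in OllamaLocal.basic_request, replacing the
# bare `tot_eval_tokens += response_json.get("eval_count")` that raises here.
eval_count = response_json.get("eval_count")
if eval_count is None:
    # Cached Ollama responses can omit eval_count; reuse the last value seen.
    eval_count = self._prev_eval_count
else:
    self._prev_eval_count = eval_count

tot_eval_tokens += eval_count
```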
Exactly the same issue here with eval_count!
I'm having the same problem with eval_count:
TypeError Traceback (most recent call last)
Cell In[51], line 3
1 with dspy.context(max_tokens=8000, temperature=0):
2 optimized_cot_bfs = BootstrapFewShot(metric=f84_metric, max_bootstrapped_demos=4, max_labeled_demos=6
----> 3 ).compile(CoT(), trainset=tset, valset=devset)
4 print(optimized_cot_bfs(**example.inputs()))
5 backend_llm.inspect_history(1)
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/teleprompt/bootstrap.py:52, in BootstrapFewShot.compile(self, student, teacher, trainset, valset)
50 self._prepare_student_and_teacher(student, teacher)
51 self._prepare_predictor_mappings()
---> 52 self._bootstrap()
54 self.student = self._train()
55 self.student._compiled = True
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/teleprompt/bootstrap.py:109, in BootstrapFewShot._bootstrap(self, max_bootstraps)
106 break
108 if example_idx not in bootstrapped:
--> 109 success = self._bootstrap_one_example(example, round_idx)
111 if success:
112 bootstrapped[example_idx] = True
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/teleprompt/bootstrap.py:164, in BootstrapFewShot._bootstrap_one_example(self, example, round_idx)
162 current_error_count = self.error_count
163 if current_error_count >= self.max_errors:
--> 164 raise e
165 print(f'Failed to run or to evaluate example {example} with {self.metric} due to {e}.')
167 if success:
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/teleprompt/bootstrap.py:143, in BootstrapFewShot._bootstrap_one_example(self, example, round_idx)
140 predictor_cache[name] = predictor.demos
141 predictor.demos = [x for x in predictor.demos if x != example]
--> 143 prediction = teacher(**example.inputs())
144 trace = dsp.settings.trace
146 for name, predictor in teacher.named_predictors():
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/primitives/program.py:26, in Module.__call__(self, *args, **kwargs)
25 def __call__(self, *args, **kwargs):
---> 26 return self.forward(*args, **kwargs)
Cell In[50], line 26, in CoT.forward(self, *args, **kwargs)
25 def forward(self, *args, **kwargs):
---> 26 answer = self.prog(*args, **kwargs)
27 return answer
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/primitives/program.py:26, in Module.__call__(self, *args, **kwargs)
25 def __call__(self, *args, **kwargs):
---> 26 return self.forward(*args, **kwargs)
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/functional/functional.py:180, in TypedPredictor.forward(self, **kwargs)
178 signature = self._prepare_signature()
179 for try_i in range(self.max_retries):
--> 180 result = self.predictor(**modified_kwargs, new_signature=signature)
181 errors = {}
182 parsed_results = []
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/predict/predict.py:49, in Predict.__call__(self, **kwargs)
48 def __call__(self, **kwargs):
---> 49 return self.forward(**kwargs)
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dspy/predict/predict.py:91, in Predict.forward(self, **kwargs)
88 template = signature_to_template(signature)
90 if self.lm is None:
---> 91 x, C = dsp.generate(template, **config)(x, stage=self.stage)
92 else:
93 # Note: query_only=True means the instructions and examples are not included.
94 # I'm not really sure why we'd want to do that, but it's there.
95 with dsp.settings.context(lm=self.lm, query_only=True):
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dsp/primitives/predict.py:77, in _generate.<locals>.do_generate(example, stage, max_depth, original_example)
75 # Generate and extract the fields.
76 prompt = template(example)
---> 77 completions: list[dict[str, Any]] = generator(prompt, **kwargs)
78 completions: list[Example] = [template.extract(example, p) for p in completions]
80 # Find the completions that are most complete.
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dsp/modules/ollama.py:171, in OllamaLocal.__call__(self, prompt, only_completed, return_sorted, **kwargs)
168 assert only_completed, "for now"
169 assert return_sorted is False, "for now"
--> 171 response = self.request(prompt, **kwargs)
173 choices = response["choices"]
175 completed_choices = [c for c in choices if c["finish_reason"] != "length"]
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dsp/modules/ollama.py:145, in OllamaLocal.request(self, prompt, **kwargs)
142 if "model_type" in kwargs:
143 del kwargs["model_type"]
--> 145 return self.basic_request(prompt, **kwargs)
File ~/venvs/jupyter-env/lib/python3.12/site-packages/dsp/modules/ollama.py:121, in OllamaLocal.basic_request(self, prompt, **kwargs)
106 text = (
107 response_json.get("message").get("content")
108 if self.model_type == "chat"
109 else response_json.get("response")
110 )
111 request_info["choices"].append(
112 {
113 "index": i,
(...)
119 },
120 )
--> 121 tot_eval_tokens += response_json.get("eval_count")
122 request_info["additional_kwargs"] = {k: v for k, v in response_json.items() if k not in ["response"]}
124 request_info["usage"] = {
125 "prompt_tokens": response_json.get("prompt_eval_count", self._prev_prompt_eval_count),
126 "completion_tokens": tot_eval_tokens,
127 "total_tokens": response_json.get("prompt_eval_count", self._prev_prompt_eval_count) + tot_eval_tokens,
128 }
TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
Similarly having issues with eval_count, as everyone above; I ran into it when using the signature optimizer with depth=breadth=4.
Same problem here:
due to unsupported operand type(s) for +=: 'int' and 'NoneType'. [dspy.teleprompt.bootstrap] filename=bootstrap.py lineno=211
I am facing the same issue at line 136 of ollama.py. I just opened a pull request following the approach provided and used by @INF800 (https://github.com/stanfordnlp/dspy/issues/293#issuecomment-1921965485, https://github.com/stanfordnlp/dspy/pull/325).
Stack Trace:
File "C:\someproject.venv\Lib\site-packages\dspy\primitives\assertions.py", line 294, in forward return wrapped_forward(*args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\primitives\assertions.py", line 220, in wrapper result = func(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "c:\someproject\src\scripts\extract_information.py", line 140, in forward result = self.prog(input=input, question=self.question) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\primitives\program.py", line 26, in call return self.forward(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\functional\functional.py", line 295, in forward result = self.predictor(modified_kwargs, new_signature=signature) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\predict\retry.py", line 64, in call pred = self.module(kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\predict\predict.py", line 78, in call return self.forward(kwargs) ^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\predict\predict.py", line 116, in forward completions = old_generate(demos, signature, kwargs, config, self.lm, self.stage) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dspy\predict\predict.py", line 143, in old_generate x, C = dsp.generate(template, config)(x, stage=stage) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dsp\primitives\predict.py", line 73, in do_generate completions: list[dict[str, Any]] = generator(prompt, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dsp\modules\ollama.py", line 186, in call response = self.request(prompt, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dsp\modules\ollama.py", line 160, in request return self.basic_request(prompt, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\someproject.venv\Lib\site-packages\dsp\modules\ollama.py", line 136, in basic_request tot_eval_tokens += response_json.get("eval_count") TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'**
Hey,
When I try the Getting Started Notebook with Ollama, the first time I run one of the examples it works fine. But the next time, the response is maybe cached and it throws an error around token usage?
First time:
Try again: