stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License
17.36k stars 1.33k forks

MIPROv2 crashes with AssertionError: No input variables found in the example #1506

Closed · NumberChiffre closed this 1 week ago

NumberChiffre commented 1 week ago

Description

Hey guys,

I built an extractor that uses ChainOfThought to extract entities and relationships from text, and tried to run it through MIPROv2 to generate optimized prompt instructions, but so far I haven't been able to get far. I am also using the recommended dspy.settings.configure(experimental=True), because without it the program keeps complaining about format_handler; it is after switching to the experimental feature that I hit this new problem. Note that this error occurs on both v2.4.14 and v2.4.16.

The following logs show that the 4 training examples are processed without problems, and that the run then fails after bootstrapping with AssertionError: No input variables found in the example:

Code:

Full code in jupyter notebook and logs: https://github.com/gusye1234/nano-graphrag/blob/be09b227e018ee56faa5b3d0e4f74d0ba7d04f63/examples/finetune_entity_relationship_dspy.ipynb

Logs

WARNING: Projected Language Model (LM) Calls

Please be advised that based on the parameters you have set, the maximum number of LM calls is projected as follows:

- Prompt Model: 10 data summarizer calls + 4 * 2 lm calls in program + (3) lm calls in program aware proposer = 21 prompt model calls
- Task Model: 25 examples in minibatch * 10 batches + 4 examples in train set * 1 full evals = 254 task model calls

Estimated Cost Calculation:

Total Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token))
            + (Number of calls to prompt model * (Avg Input Token Length per Call * Prompt Model Price per Input Token + Avg Output Token Length per Call * Prompt Model Price per Output Token))

For a preliminary estimate of potential costs, we recommend you perform your own calculations based on the task
and prompt models you intend to use. If the projected costs exceed your budget or expectations, you may consider:

- Reducing the number of trials (`num_batches`), the size of the trainset, or the number of LM calls in your program.
- Using a cheaper task model to optimize the prompt.
Error getting source code: unhashable type: 'list'.

Running without program aware proposer.
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
summary: Prediction(
    summary='The dataset comprises news articles covering diverse topics such as political events, economic data, and corporate news, with each example detailing entities and their relationships. This structured content, presented in formal journalistic syntax, is highly relevant to current events and suitable for training models in entity recognition, relationship extraction, and summarization tasks.'
)
DATA SUMMARY: The dataset comprises news articles covering diverse topics such as political events, economic data, and corporate news, with each example detailing entities and their relationships. This structured content, presented in formal journalistic syntax, is highly relevant to current events and suitable for training models in entity recognition, relationship extraction, and summarization tasks.
  0%|          | 0/4 [00:00<?, ?it/s]INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 29 | Missed Entities: 0 | Total Entities: 29
DEBUG:nano-graphrag:Relationships: 19 | Missed Relationships: 7 | Total Relationships: 26
DEBUG:nano-graphrag:Direct Relationships: 26 | Second-order: 0 | Third-order: 0 | Total Relationships: 26
 25%|██▌       | 1/4 [03:32<10:37, 212.60s/it]INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 6 | Missed Entities: 0 | Total Entities: 6
DEBUG:nano-graphrag:Relationships: 5 | Missed Relationships: 0 | Total Relationships: 5
DEBUG:nano-graphrag:Direct Relationships: 5 | Second-order: 0 | Third-order: 0 | Total Relationships: 5
 50%|█████     | 2/4 [04:26<03:58, 119.06s/it]INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 20 | Missed Entities: 0 | Total Entities: 20
DEBUG:nano-graphrag:Relationships: 22 | Missed Relationships: 6 | Total Relationships: 28
DEBUG:nano-graphrag:Direct Relationships: 28 | Second-order: 0 | Third-order: 0 | Total Relationships: 28
 75%|███████▌  | 3/4 [07:26<02:27, 147.10s/it]INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 6 | Missed Entities: 0 | Total Entities: 6
DEBUG:nano-graphrag:Relationships: 6 | Missed Relationships: 2 | Total Relationships: 8
DEBUG:nano-graphrag:Direct Relationships: 7 | Second-order: 1 | Third-order: 0 | Total Relationships: 8
100%|██████████| 4/4 [08:32<00:00, 128.14s/it]
Bootstrapped 4 full traces after 4 examples in round 0.
  0%|          | 0/4 [00:00<?, ?it/s]INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 17 | Missed Entities: 0 | Total Entities: 17
DEBUG:nano-graphrag:Relationships: 12 | Missed Relationships: 8 | Total Relationships: 20
DEBUG:nano-graphrag:Direct Relationships: 17 | Second-order: 3 | Third-order: 0 | Total Relationships: 20
 25%|██▌       | 1/4 [02:32<07:36, 152.01s/it]INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 29 | Missed Entities: 0 | Total Entities: 29
DEBUG:nano-graphrag:Relationships: 18 | Missed Relationships: 10 | Total Relationships: 28
DEBUG:nano-graphrag:Direct Relationships: 28 | Second-order: 0 | Third-order: 0 | Total Relationships: 28
 50%|█████     | 2/4 [06:17<06:30, 195.10s/it]WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 6 | Missed Entities: 0 | Total Entities: 6
DEBUG:nano-graphrag:Relationships: 5 | Missed Relationships: 0 | Total Relationships: 5
DEBUG:nano-graphrag:Direct Relationships: 5 | Second-order: 0 | Third-order: 0 | Total Relationships: 5
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.deepseek.com/chat/completions "HTTP/1.1 200 OK"
WARNING:nano-graphrag:Received an empty JSON string
DEBUG:nano-graphrag:Entities: 7 | Missed Entities: 0 | Total Entities: 7
DEBUG:nano-graphrag:Relationships: 7 | Missed Relationships: 1 | Total Relationships: 8
DEBUG:nano-graphrag:Direct Relationships: 5 | Second-order: 3 | Third-order: 0 | Total Relationships: 8
100%|██████████| 4/4 [07:18<00:00, 109.53s/it]
Bootstrapped 4 full traces after 4 examples in round 0.
Using a randomly generated configuration for our grounded proposer.
Selected tip: description
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[13], line 10
      1 optimizer = MIPROv2(
      2     prompt_model=lm,
      3     task_model=lm,
   (...)
      7     verbose=True
      8 )
      9 kwargs = dict(num_threads=os.cpu_count(), display_progress=True, display_table=0)
---> 10 miprov2_model = optimizer.compile(
     11     model, 
     12     trainset=trainset[:4], 
     13     valset=valset, 
     14     requires_permission_to_run=False,
     15     num_batches=10, 
     16     max_labeled_demos=5, 
     17     max_bootstrapped_demos=4, 
     18     eval_kwargs=kwargs
     19 )
     20 miprov2_model

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/teleprompt/mipro_optimizer_v2.py:291, in MIPROv2.compile(self, student, trainset, valset, num_batches, max_bootstrapped_demos, max_labeled_demos, eval_kwargs, seed, minibatch, program_aware_proposer, requires_permission_to_run)
    289 proposer.use_instruct_history = False
    290 proposer.set_history_randomly = False
--> 291 instruction_candidates = proposer.propose_instructions_for_program(
    292     trainset=trainset,
    293     program=program,
    294     demo_candidates=demo_candidates,
    295     N=self.n,
    296     prompt_model=self.prompt_model,
    297     T=self.init_temperature,
    298     trial_logs={},
    299 )
    300 for i, pred in enumerate(program.predictors()):
    301     instruction_candidates[i][0] = get_signature(pred).instructions

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/propose/grounded_proposer.py:302, in GroundedProposer.propose_instructions_for_program(self, trainset, program, demo_candidates, prompt_model, trial_logs, N, T, tip)
    299         if pred_i not in proposed_instructions:
    300             proposed_instructions[pred_i] = []
    301         proposed_instructions[pred_i].append(
--> 302             self.propose_instruction_for_predictor(
    303                 program=program,
    304                 predictor=predictor,
    305                 pred_i=pred_i,
    306                 prompt_model=prompt_model,
    307                 T=T,
    308                 demo_candidates=demo_candidates,
    309                 demo_set_i=demo_set_i,
    310                 trial_logs=trial_logs,
    311                 tip=selected_tip,
    312             ),
    313         )
    314 return proposed_instructions

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/propose/grounded_proposer.py:349, in GroundedProposer.propose_instruction_for_predictor(self, program, predictor, pred_i, prompt_model, T, demo_candidates, demo_set_i, trial_logs, tip)
    347 with dspy.settings.context(lm=prompt_model):
    348     prompt_model.kwargs["temperature"] = T
--> 349     proposed_instruction = instruction_generator.forward(
    350         demo_candidates=demo_candidates,
    351         pred_i=pred_i,
    352         demo_set_i=demo_set_i,
    353         program=program,
    354         data_summary=self.data_summary,
    355         previous_instructions=instruction_history,
    356         tip=tip,
    357     ).proposed_instruction
    358 prompt_model.kwargs["temperature"] = original_temp
    360 # Log the trace used to generate the new instruction, along with the new instruction itself

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/propose/grounded_proposer.py:203, in GenerateModuleInstruction.forward(self, demo_candidates, pred_i, demo_set_i, program, previous_instructions, data_summary, max_demos, tip)
    200     modules = [match[0].strip() for match in matches]
    201     module_code = modules[pred_i]
--> 203 module_description = self.describe_module(
    204     program_code=self.program_code_string,
    205     program_description=program_description,
    206     program_example=task_demos,
    207     module=module_code,
    208     max_depth=10,
    209 ).module_description
    211 # Generate an instruction for our chosen module
    212 print(f"task_demos {task_demos}")

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/predict/predict.py:91, in Predict.__call__(self, **kwargs)
     90 def __call__(self, **kwargs):
---> 91     return self.forward(**kwargs)

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/predict/predict.py:126, in Predict.forward(self, **kwargs)
    124     completions = v2_5_generate(lm, config, signature, demos, kwargs)
    125 elif dsp.settings.experimental:
--> 126     completions = new_generate(lm, signature, dsp.Example(demos=demos, **kwargs), **config)
    127 else:
    128     completions = old_generate(demos, signature, kwargs, config, self.lm, self.stage)

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dspy/predict/predict.py:179, in new_generate(lm, signature, example, max_depth, **kwargs)
    177 # Generate and extract the fields.
    178 template = signature_to_template(signature, adapter=dsp.ExperimentalAdapter)
--> 179 prompt = template(example)
    180 completions = lm(prompt, **kwargs)
    181 completions = [template.extract(example, p) for p in completions]

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dsp/adapters/experimental_adapter.py:191, in ExperimentalAdapter.__call__(self, example, show_guidelines)
    187 rdemos = rdemos_
    189 example["augmented"] = True
--> 191 query = self.query(example)
    192 parts = [self.instructions, *rdemos, self.guidelines(show_guidelines), *ademos, query,]
    194 prompt = "\n\n---\n\n".join([p.strip() for p in parts if p])

File /opt/homebrew/Caskroom/miniconda/base/envs/nano-graphrag/lib/python3.10/site-packages/dsp/adapters/experimental_adapter.py:25, in ExperimentalAdapter.query(self, example, is_demo)
     17 has_value = [
     18     field.input_variable in example
     19     and example[field.input_variable] is not None
     20     and example[field.input_variable] != ""
     21     for field in self.fields
     22 ]
     24 if not any(has_value):
---> 25     assert False, "No input variables found in the example"
     27 for i in range(1, len(has_value)):
     28     if has_value[i - 1] and not any(has_value[i:]):

AssertionError: No input variables found in the example
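For anyone hitting the same assertion: it fires when none of the signature's input fields carries a value. A simplified sketch of the check performed in ExperimentalAdapter.query (paraphrasing the traceback above, not the exact library code):

```python
# Simplified version of the input-variable check in ExperimentalAdapter.query:
# the assertion fires only when every input field is absent, None, or "".
def has_any_input(example: dict, input_fields: list[str]) -> bool:
    has_value = [
        name in example and example[name] is not None and example[name] != ""
        for name in input_fields
    ]
    return any(has_value)

assert has_any_input({"input_text": "some article"}, ["input_text"])
# An example whose inputs are all empty would trip
# "AssertionError: No input variables found in the example":
assert not has_any_input({"input_text": ""}, ["input_text"])
```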
NumberChiffre commented 1 week ago

Closing this issue after getting rid of Error getting source code: unhashable type: 'list'. The fix was converting the extraction dspy.Module from ChainOfThought to TypedChainOfThought, with Pydantic models for the input and output fields.