stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License

MIPRO crashes with ValueError: No examples found for the given predictor #1137

Closed felixgao closed 3 weeks ago

felixgao commented 4 months ago

I am trying to run MIPRO on my signature, and it fails with the exception below.

Signature Optimization...
WARNING: Projected Language Model (LM) Calls

Please be advised that based on the parameters you have set, the maximum number of LM calls is projected as follows:

- Task Model: 9 examples in dev set * 2 trials * # of LM calls in your program = (18 * # of LM calls in your program) task model calls
- Prompt Model: # data summarizer calls (max 10) + 2 * 1 lm calls in program = 12 prompt model calls

Estimated Cost Calculation:

Total Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token)
            + (Number of calls to prompt model * (Avg Input Token Length per Call * Task Prompt Price per Input Token + Avg Output Token Length per Call * Prompt Model Price per Output Token).

For a preliminary estimate of potential costs, we recommend you perform your own calculations based on the task
and prompt models you intend to use. If the projected costs exceed your budget or expectations, you may consider:

- Reducing the number of trials (`num_trials`), the size of the trainset, or the number of LM calls in your program.
- Using a cheaper task model to optimize the prompt.
Creating basic bootstrap: 1/1
  0%|                                                                                                                     | 0/9 [00:00<?, ?it/s]Expected: 19, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '178.08': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '41.98': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, '6200_oak_tree_blvd___suite_250_independence,_oh_44131': {'pred': None, 'ground': None}, 'mary_a_dawkins': {'pred': None, 'ground': None}, 'xxx_xx_2489': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '2307.11': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, '33.45': {'pred': None, 'ground': None}, 'hut_american_group_llc': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '2129.03': {'pred': None, 'ground': None}, 'al': {'pred': None, 'ground': None}, '0000034180': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 11%|████████████                                                                                                 | 1/9 [00:02<00:20,  2.53s/it]Expected: 12, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '178.08': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, '6200_oak_tree_blvd___suite_250_independence,_oh_44131': {'pred': None, 'ground': None}, 'xxx_xx_2489': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, '33.45': {'pred': None, 'ground': None}, 'al': {'pred': None, 'ground': None}, '0000034180': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 22%|████████████████████████▏                                                                                    | 2/9 [00:04<00:15,  2.22s/it]Expected: 17, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '178.08': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, '6200_oak_tree_blvd___suite_250_independence,_oh_44131': {'pred': None, 'ground': None}, 'xxx_xx_2489': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '2307.11': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, '33.45': {'pred': None, 'ground': None}, 'hut_american_group_llc': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '2129.03': {'pred': None, 'ground': None}, 'al': {'pred': None, 'ground': None}, '0000034180': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 33%|████████████████████████████████████▎                                                                        | 3/9 [00:06<00:13,  2.31s/it]Expected: 14, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '178.08': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '41.98': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, '6200_oak_tree_blvd___suite_250_independence,_oh_44131': {'pred': None, 'ground': None}, 'mary_a_dawkins': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, 'hut_american_group_llc': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '2129.03': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 44%|████████████████████████████████████████████████▍                                                            | 4/9 [00:09<00:11,  2.33s/it]Expected: 26, Diff: {'hallucinated': {'employers_address_and_zip_code': {'pred': '6200 OAK TREE BLVD - SUITE 250 INDEPENDENCE, OH 44131', 'ground': None}, 'employees_ssa_number': {'pred': 'XXX-XX-2489', 'ground': None}, 'employees_address_and_zip_code': {'pred': '1609 DEATSVILLE HWY MILLBROOK, AL 36054', 'ground': None}}, 'incorrect': {'box_12a': {'pred': 'DD | 69.71', 'ground': 'DD 69.71'}}}
 56%|████████████████████████████████████████████████████████████▌                                                | 5/9 [00:12<00:10,  2.74s/it]Expected: 16, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '178.08': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '41.98': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, 'mary_a_dawkins': {'pred': None, 'ground': None}, 'xxx_xx_2489': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '2307.11': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, '33.45': {'pred': None, 'ground': None}, 'hut_american_group_llc': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '2129.03': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 67%|████████████████████████████████████████████████████████████████████████▋                                    | 6/9 [00:15<00:07,  2.61s/it]Expected: 12, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '6200_oak_tree_blvd___suite_250_independence,_oh_44131': {'pred': None, 'ground': None}, 'mary_a_dawkins': {'pred': None, 'ground': None}, 'xxx_xx_2489': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, 'hut_american_group_llc': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '2307.11': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 78%|████████████████████████████████████████████████████████████████████████████████████▊                        | 7/9 [00:17<00:04,  2.46s/it]Expected: 12, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '2307.11': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '0000034180': {'pred': None, 'ground': None}, '41.98': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '33.45': {'pred': None, 'ground': None}, 'mary_a_dawkins': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
 89%|████████████████████████████████████████████████████████████████████████████████████████████████▉            | 8/9 [00:19<00:02,  2.28s/it]Expected: 16, Diff: {'hallucinated': {'n/a': {'pred': None, 'ground': None}, '85_4006070': {'pred': None, 'ground': None}, '41.98': {'pred': None, 'ground': None}, '143.04': {'pred': None, 'ground': None}, '6200_oak_tree_blvd___suite_250_independence,_oh_44131': {'pred': None, 'ground': None}, 'mary_a_dawkins': {'pred': None, 'ground': None}, 'xxx_xx_2489': {'pred': None, 'ground': None}, '69.71': {'pred': None, 'ground': None}, '1609_deatsville_hwy_millbrook,_al_36054': {'pred': None, 'ground': None}, '2307.11': {'pred': None, 'ground': None}, '87.86': {'pred': None, 'ground': None}, '33.45': {'pred': None, 'ground': None}, 'hut_american_group_llc': {'pred': None, 'ground': None}, 'r010981778': {'pred': None, 'ground': None}, '0000034180': {'pred': None, 'ground': None}}, 'incorrect': {'third_party_sick_pay': {'pred': None, 'ground': 'N/A'}}}
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:21<00:00,  2.40s/it]
ChainOfThought(GenerateExtraction(context, task -> answer
    instructions='Extracting requested information from a document.'
    context = Field(annotation=str required=True json_schema_extra={'desc': 'The document text.', '__dspy_field_type': 'input', 'prefix': 'Context:'})
    task = Field(annotation=str required=True json_schema_extra={'desc': 'The task to extract necessary information from document. Use the mapping values for the keys.', '__dspy_field_type': 'input', 'prefix': 'Task:'})
    answer = Field(annotation=str required=True json_schema_extra={'desc': "A list of expected key-value pairs, if a key doesn't have a value return N/A. \n        IMPORTANT!!! The list must be semi-colon separated.  \n        Do not include any other information.", '__dspy_field_type': 'output', 'prefix': 'Answer:'})
)) has no examples. Example sets: {0: []}
Traceback (most recent call last):
  File "/Users/ggao/github/ged/gemini/gemini/doc_qa.py", line 126, in <module>
    fire.Fire(main)
  File "/Users/ggao/github/ged/gemini/.venv/lib/python3.11/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ggao/github/ged/gemini/.venv/lib/python3.11/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/Users/ggao/github/ged/gemini/.venv/lib/python3.11/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ggao/github/ged/gemini/gemini/doc_qa.py", line 115, in main
    optmized = signature_optimization_mipro(module, train, metric_fn, gemini_pro_1_5, gemini_flash_1_5, thread_count)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ggao/github/ged/gemini/gemini/optimizers.py", line 23, in signature_optimization_mipro
    compiled_program = teleprompter.compile(module, trainset=train, num_trials=2,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ggao/github/ged/gemini/.venv/lib/python3.11/site-packages/dspy/teleprompt/mipro_optimizer.py", line 460, in compile
    instruction_candidates, _ = self._generate_first_N_candidates(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ggao/github/ged/gemini/.venv/lib/python3.11/site-packages/dspy/teleprompt/mipro_optimizer.py", line 282, in _generate_first_N_candidates
    raise ValueError("No examples found for the given predictor")
ValueError: No examples found for the given predictor
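
The cost formula printed in the warning above can be worked through as a short calculation. This is only a sketch: the call counts (18 and 12) come from the log, assuming one LM call per program run as the prompt-model line suggests, while the token lengths and per-token prices are made-up placeholders, not real Gemini pricing. `ModelUsage` is a hypothetical helper, not part of DSPy.

```python
from dataclasses import dataclass


@dataclass
class ModelUsage:
    """Hypothetical container for the quantities in the log's cost formula."""
    calls: int
    avg_input_tokens: float
    avg_output_tokens: float
    price_per_input_token: float
    price_per_output_token: float

    def cost(self) -> float:
        # calls * (avg input tokens * input price + avg output tokens * output price)
        per_call = (self.avg_input_tokens * self.price_per_input_token
                    + self.avg_output_tokens * self.price_per_output_token)
        return self.calls * per_call


# Call counts from the log: 9 examples * 2 trials * 1 LM call, and 10 + 2 * 1.
# Token lengths and prices below are placeholder assumptions.
task = ModelUsage(calls=9 * 2 * 1, avg_input_tokens=2000, avg_output_tokens=300,
                  price_per_input_token=1e-6, price_per_output_token=2e-6)
prompt = ModelUsage(calls=10 + 2 * 1, avg_input_tokens=1500, avg_output_tokens=200,
                    price_per_input_token=2e-6, price_per_output_token=4e-6)

total_cost = task.cost() + prompt.cost()
print(f"Estimated total cost: {total_cost:.4f}")
```

Substituting your own average token lengths and the published prices of your task and prompt models gives the preliminary estimate the warning asks for.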

Relevant code:

from dspy.teleprompt import MIPRO  # import path assumed for the DSPy version in the traceback

def signature_optimization_mipro(module, train, metric_fn, prompt_lm, task_lm, thread_count: int = 1) -> "SimpleDocumentTextQASignature":
    # Optimize the module's prompt with MIPRO using the given prompt and task models.
    teleprompter = MIPRO(prompt_model=prompt_lm, task_model=task_lm, metric=metric_fn, num_candidates=2,
                         init_temperature=1.0, verbose=True)
    compiled_program = teleprompter.compile(module, trainset=train, num_trials=2,
                                            max_bootstrapped_demos=1, max_labeled_demos=0,
                                            eval_kwargs={'num_threads': thread_count, 'display_progress': True},
                                            requires_permission_to_run=False)
    # Save the optimized program, then reload it into a fresh module and return that.
    compiled_program.save("optimized_signature_mipro.json")
    module = SimpleDocumentTextQASignature()
    module.load("optimized_signature_mipro.json")
    return module

class SimpleDocumentTextQASignature(SimpleDocumentTextQA):

    def forward(self, context, question):
        result = super().forward(context, question)
        return dspy.Prediction(
            answer = result.answer,
            reasoning = result.rationale,
        )

module = SimpleDocumentTextQASignature()
optmized = signature_optimization_mipro(module, train, metric_fn, gemini_pro_1_5, gemini_flash_1_5, thread_count)

felixgao commented 4 months ago

I did some debugging, and it seems that the metric returns False for all of my examples. I feel this should still work somehow, because the optimizer's whole job is to figure out the correct prompt to use for the task.
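
A quick way to confirm this before running the optimizer is to count how many training examples pass the metric. The sketch below uses plain Python stand-ins; `count_metric_passes`, `always_wrong`, and `exact_match` are hypothetical names, and the two-argument metric is a simplification of a real DSPy metric. It assumes, consistent with the `Example sets: {0: []}` line in the log, that bootstrapping only keeps examples for which the metric passes.

```python
def count_metric_passes(examples, predict, metric):
    """Return (passed, total) for a prediction function and a boolean metric."""
    passed = 0
    for example in examples:
        prediction = predict(example)
        if metric(example, prediction):
            passed += 1
    return passed, len(examples)


# Toy stand-ins for a trainset, a program, and a metric (all hypothetical).
examples = [
    {"question": "2+2", "answer": "4"},
    {"question": "3+3", "answer": "6"},
]


def always_wrong(example):
    # Mimics a program whose output never matches the ground truth.
    return "N/A"


def exact_match(example, prediction):
    return example["answer"] == prediction


passed, total = count_metric_passes(examples, always_wrong, exact_match)
if passed == 0:
    print(f"0/{total} examples pass the metric; bootstrapping would have nothing to keep.")
```

If the count is zero, the likely next step is to inspect the metric or the program's output format rather than the optimizer settings.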

tom-doerr commented 4 months ago

You could increase the num_candidates value so that it tries more than 10 times.

class MIPRO(Teleprompter):
    def __init__(
        self,
        metric,
        prompt_model=None,
        task_model=None,
        teacher_settings={},
        num_candidates=10,
        init_temperature=1.0,
        verbose=False,
        track_stats=True,
        view_data_batch_size=10,
    ):

However, the best option might be to give it at least one sample that also has a label.

denisp-adsk commented 3 weeks ago

@felixgao Did you end up finding a solution? I am also blocked on this while trying to fit a classifier using DSPy.

okhat commented 3 weeks ago

Probably fixed now in DSPy 2.5 with the MIPROv2 updates? Closing but feel free to re-open.