stanfordnlp / dspy

DSPy: The framework for programming—not prompting—language models
https://dspy.ai
MIT License

MIPRO Bootstrap + Evaluation Parallelism #1622

Open XenonMolecule opened 1 month ago

XenonMolecule commented 1 month ago

This PR includes two main updates:

  1. Enables nested parallelism in Evaluate. Previously, the signal handling for user terminations was incompatible with nested threading: if a user program used multithreading (for instance, calling BootstrapFewShotWithRandomSearch as a meta-optimization within a program being evaluated or optimized), Evaluate would throw an exception. This nested multithreading now works.
  2. MIPRO bootstraps in parallel. All runs of the program that bootstrap the few-shot demonstrations now occur at the same time. This is only enabled by setting bootstrap_parallel=True in the MIPROv2 class (it is False by default). In theory, the same change could be applied to BootstrapFewShotWithRandomSearch as well. The main TODO is to fix the printing, because currently all the evaluation progress bars print simultaneously, which makes for a somewhat ugly printout.
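
For readers following along, here is a minimal sketch of the idea behind item 1. It is not the code in this PR. It assumes the approach is to install the SIGINT handler only when running in the main thread and to skip it in nested (worker-thread) calls. The `evaluate` function, the `cancel_event` parameter, and the toy `inner` program are hypothetical stand-ins; only the name `interrupt_handler_manager` comes from the Evaluate code.

```python
import signal
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


@contextmanager
def interrupt_handler_manager(cancel_event):
    """Install a SIGINT handler only when called from the main thread."""
    if threading.current_thread() is not threading.main_thread():
        # Nested call from a worker thread: signal.signal() would raise
        # ValueError here, so rely on the handler installed by the outer call.
        yield
        return

    def handler(signum, frame):
        # Hypothetical cancellation flag; workers could poll it to stop early.
        cancel_event.set()

    previous = signal.signal(signal.SIGINT, handler)
    try:
        yield
    finally:
        # Always restore whatever handler was active before.
        signal.signal(signal.SIGINT, previous)


def evaluate(items, fn, num_threads=2):
    """Hypothetical stand-in for Evaluate: map fn over items in a thread pool."""
    cancel_event = threading.Event()
    with ThreadPoolExecutor(max_workers=num_threads) as executor, \
            interrupt_handler_manager(cancel_event):
        return list(executor.map(fn, items))


def inner(x):
    # A "program" that itself runs a nested evaluation from a worker thread.
    return sum(evaluate([x, x + 1], lambda v: v * 2))


outer_results = evaluate([1, 2, 3], inner)
print(outer_results)
```

With the main-thread check removed, the nested `evaluate` call inside `inner` would fail as soon as it tried to register the handler from a worker thread.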

I would appreciate a more thorough review of this PR, since it touches functionality outside of my area of expertise (by modifying Evaluate) and because I am a relative novice when it comes to multithreading in Python.

mikeedjones commented 1 month ago

Which exception was thrown with a nested threadpool? An MRE/tests to avoid regressions would be great :)

XenonMolecule commented 1 month ago

> Which exception was thrown with a nested threadpool? An MRE/tests to avoid regressions would be great :)


```
        with inputs:
                Example({'context': 'a boy is not concentrating on a machine', 'question': 'Can we logically conclude for sure that a boy is not concentrating on a typewriter?'}) (input_keys={'context', 'question'})

Stack trace:
        Traceback (most recent call last):
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/dspy/evaluate/evaluate.py", line 175, in wrapped_program
    prediction = program(**example.inputs())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/dspy/primitives/program.py", line 26, in __call__
    return self.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/testing/tasks/scone.py", line 68, in forward
    new_prog = optimizer.compile(dspy.ChainOfThought("context,question->answer"), trainset=self.trainset, valset=self.testset)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/dspy/teleprompt/random_search.py", line 118, in compile
    score, subscores = evaluate(program, return_all_scores=True)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/dspy/evaluate/evaluate.py", line 211, in __call__
    reordered_devset, ncorrect, ntotal = self._execute_multi_thread(
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/dspy/evaluate/evaluate.py", line 119, in _execute_multi_thread
    with ThreadPoolExecutor(max_workers=num_threads) as executor, interrupt_handler_manager():
  File "/opt/miniconda3/envs/dspy_maintenance/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/Users/michaelryan/Documents/School/Stanford/misc/dspy_updates/dspy/dspy/evaluate/evaluate.py", line 108, in interrupt_handler_manager
    signal.signal(signal.SIGINT, interrupt_handler)
  File "/opt/miniconda3/envs/dspy_maintenance/lib/python3.12/signal.py", line 58, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: signal only works in main thread of the main interpreter
```

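
The last frame of the trace can be reproduced without DSPy. The following is a minimal sketch using only the standard library, not the regression test from this PR; `install_handler` and `errors` are illustrative names. It shows that CPython refuses to register a signal handler from any thread other than the main one, which is what happens when Evaluate runs inside a worker thread of an outer thread pool.

```python
import signal
import threading

errors = []


def install_handler():
    # Mirrors what the interrupt handler setup does, but from a worker thread.
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        # CPython only allows signal.signal() in the main thread.
        errors.append(str(exc))


worker = threading.Thread(target=install_handler)
worker.start()
worker.join()

# One ValueError is recorded; its message mentions the main thread.
print(errors)
```

A regression test could assert the opposite for the fixed code path: that a nested evaluation started from a worker thread completes without raising.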
okhat commented 1 month ago

@XenonMolecule I love this and @mikeedjones thanks for the review :D

I want to merge this and to use it. I see failing tests. Is that expected? Are there blockers for merge?

XenonMolecule commented 1 month ago

> @XenonMolecule I love this and @mikeedjones thanks for the review :D
>
> I want to merge this and to use it. I see failing tests. Is that expected? Are there blockers for merge?

The failing tests only started when I pushed a commit that changed comments, so not sure what happened. Looking into it.