stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy.ai
MIT License

`dsp.Example` class not compatible with multiprocessing #73

Closed danielmachlab closed 1 year ago

danielmachlab commented 1 year ago

I am trying to use DSP in an ipynb notebook I wrote that sends hundreds of prompts to the OpenAI API. Because of the volume, I previously used `Pool` from the `multiprocessing` library to parallelize my requests. With DSP, however, I am not able to do this because the prompts, which are represented by the `dsp.Example` class, are not picklable (since the `__getstate__` and `__setstate__` methods are undefined), and thus not compatible with `Pool`.

Without multiprocessing, making these requests to the OpenAI API takes 10-15 minutes instead of seconds.

I've created this gist with code from the DSP intro.ipynb to illustrate my use case and reproduce the error: https://colab.research.google.com/gist/danielmachlab/fc79ce5d7e8eb7c505ea53ae56066253/knn_example.ipynb#scrollTo=vjtdEHWa19hD

The solution to this issue should be to define the `__getstate__` and `__setstate__` methods.
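
As a rough illustration of the fix proposed above, here is a minimal sketch. `AttrDict` is a hypothetical stand-in for a dict-backed class with attribute access; it is not the actual `dsp.Example` implementation. The round trip goes through `pickle` directly, since that is what `Pool` uses to move arguments between processes.

```python
import pickle


class AttrDict(dict):
    """Hypothetical dict-backed class with attribute access (not dsp.Example)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails; fall back to dict keys.
        try:
            return self[name]
        except KeyError:
            # Raise AttributeError so getattr(obj, name, default) keeps working.
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attributes as dict items.
        self[name] = value

    def __getstate__(self):
        # Hand pickle a plain dict copy of the contents.
        return dict(self)

    def __setstate__(self, state):
        # Restore the contents on the freshly created instance.
        self.update(state)


example = AttrDict(question="What is DSP?", answer="A framework")
restored = pickle.loads(pickle.dumps(example))
```

With both methods defined explicitly, pickling no longer depends on how the custom `__getattr__` reacts to lookups of special method names, so instances of a class like this should be usable as arguments to `Pool.map`.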

okhat commented 1 year ago

Oh by the way, in DSPy we have built-in support for multi-threading in the Evaluate class. Have you seen that?

It works perfectly well when the model is hosted on a server like TGI Client, vLLM, or OpenAI / Cohere.

Do you still need this?
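
For readers wondering why threads sidestep the pickling problem: threads share memory with the caller, so the prompts never have to be serialized, and requests to a hosted model spend most of their time waiting on the network. The sketch below uses only the standard library and is not the `Evaluate` class itself; `call_model` and `run_in_threads` are hypothetical names, and the sleep stands in for a real API request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_model(prompt):
    """Hypothetical stand-in for a blocking request to a hosted model."""
    time.sleep(0.01)  # simulate network latency
    return f"response to: {prompt}"


def run_in_threads(prompts, num_threads=8):
    # Worker threads share the caller's memory, so nothing is pickled.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(call_model, prompts))


prompts = [f"question {i}" for i in range(20)]
responses = run_in_threads(prompts)
```

This pattern suits I/O-bound work such as remote API calls; it would not speed up a model running CPU-bound inside the same Python process.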

okhat commented 1 year ago

closing by default but feel free to reopen