DSPy Integration - GitHub Issues

Right now, working with LLMs involves a lot of "prompt engineering" on the user's part, not only to elicit the correct response but sometimes also the correct style of response (e.g., JSON). This kind of manual, trial-and-error prompting can be quite tedious.

StanfordNLP's DSPy can alleviate many of the core issues around prompt engineering by offering a more systematic way to work with LLMs. First, it separates the flow of the application (modules) from the parameters (LM prompts and weights) of each step. Second, DSPy introduces new optimizers, which are LM-driven algorithms that can automatically tune the prompts and/or the weights of your LM calls, given a metric you want to maximize (e.g., EM, F1, or RAGAS for long-form output). This not only streamlines prompting but also provides a more structured way to define your optimizers, modules, etc. using PyTorch-style syntax.
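To make "a metric you want to maximize" concrete, here is a minimal plain-Python sketch of exact match (EM) and token-level F1. This is not DSPy's own implementation; the function names and the simple lowercase/whitespace normalization are assumptions for illustration only.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumed normalization: lowercase and split on whitespace.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 if the normalized token sequences are identical, else 0.0.
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    # Harmonic mean of token-level precision and recall.
    pred, ref = normalize(prediction), normalize(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

An optimizer would score each candidate prompt by averaging such a metric over a set of labeled examples and keep the best-scoring candidate.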

I have tried using DSPy separately from Haystack for small experiments, but I would love to use it within the larger systems I have built in Haystack. I can technically do this with a custom Component class, but I also wanted to suggest integrating DSPy formally.

Combining Haystack's ecosystem for building LLM systems with the algorithmic prompt optimization provided by DSPy could be very powerful: better LLM systems with a formalized syntax and less manual prompt engineering.

My only complaint with DSPy is that the base prompts used by the optimizers to improve the prompts are not easily changed depending on the model. For example, the Mistral instruct model expects the prompt in the following format: <s>[INST] {prompt} [/INST]. With Haystack, however, we could formalize a way to swap out the optimizer prompts depending on the model in use.
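As a rough sketch of what swapping prompts per model could look like, the snippet below keeps one template per model family and falls back to a pass-through default. The registry, the family keys, and the function name are hypothetical and are not part of DSPy or Haystack; only the Mistral instruct format is taken from the example above.

```python
# Hypothetical registry: one wrapper template per model family.
PROMPT_TEMPLATES = {
    "mistral-instruct": "<s>[INST] {prompt} [/INST]",
    "default": "{prompt}",
}


def format_for_model(prompt: str, model_family: str) -> str:
    # Unknown families fall back to the unmodified prompt.
    template = PROMPT_TEMPLATES.get(model_family, PROMPT_TEMPLATES["default"])
    return template.format(prompt=prompt)


print(format_for_model("Summarize the document.", "mistral-instruct"))
```

Keeping the templates in data rather than in code means that supporting a new model would only require a new registry entry.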

Additional resources:

DSPy docs

DSPy intro notebook

deepset-ai / haystack

DSPy Integration #7345