stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License
17.48k stars 1.33k forks

Implement FunSearch in DSPy #253

Open mattredlon opened 9 months ago

mattredlon commented 9 months ago

A recent paper by DeepMind demonstrated the use of FunSearch, an "evolutionary procedure" pairing an LLM with a systematic evaluator, on problems in extremal combinatorics (cap set) and algorithm design (online bin packing). The paper, while excellent and containing interesting work, went viral due to sensational media positioning such as: "DeepMind AI outdoes human mathematicians on unsolved problem" and the resulting (predictable) AGI/ASI grifting.
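
For readers unfamiliar with the paper, the core FunSearch loop is simple to sketch. The following is a minimal, illustrative skeleton (all names are mine, not DeepMind's): an LLM proposes variants of a "program", a systematic evaluator scores them, and only the best survive in a scored pool that seeds future proposals. The demo at the bottom stands in a random perturbation for the LLM call.

```python
import random

def funsearch(initial, evaluate, llm_propose, iterations=200, pool_size=10, seed=0):
    """Minimal sketch of the FunSearch loop: an LLM proposes variants of a
    'program', a systematic evaluator scores them, and only the best survive.
    Names and structure are illustrative, not DeepMind's implementation."""
    rng = random.Random(seed)
    pool = [(evaluate(initial), initial)]   # the scored program database
    for _ in range(iterations):
        # Bias sampling toward the top half of the pool, mirroring how
        # FunSearch biases prompts toward high-scoring programs.
        _, parent = rng.choice(pool[: max(1, len(pool) // 2)])
        child = llm_propose(parent)         # stand-in for the LLM call
        pool.append((evaluate(child), child))
        pool.sort(key=lambda pair: pair[0], reverse=True)
        del pool[pool_size:]                # evict the weakest variants
    return pool[0]

# Toy demo: "programs" are integers, the evaluator rewards closeness to 42,
# and the "LLM" is replaced by a random perturbation.
mut_rng = random.Random(1)
best_score, best_program = funsearch(
    initial=0,
    evaluate=lambda p: -abs(p - 42),
    llm_propose=lambda p: p + mut_rng.choice([-3, -1, 1, 2, 5]),
)
```

The interesting design questions for DSPy are which components own the pool, the sampling, and the evaluation, which is exactly what the discussion below turns on.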

While reading this paper I couldn't help but think of DSPy. I continue to believe this framework is "skating to where the puck will be" in terms of building truly useful, trustworthy, maintainable, LM-based systems, at least until a dramatic breakthrough in LM capabilities (e.g. reasoning, world model, etc.). It recognizes the inherent variability/unreliability and limitations of LMs today and provides useful guardrails around them in production systems. This was further enhanced with Suggest and Assert. It also enables us to take advantage of the LM's ability to generate novel solutions or approaches - a core element of FunSearch.

I suspect replicating FunSearch in DSPy would enable it to address an even larger class of real-world scientific and business problems. As I understand it, there are at least three areas that may require additional research/development to implement FunSearch (red in diagram):

[Diagram: dspy_flowchart — FunSearch mapped onto DSPy components, with areas requiring additional research/development highlighted in red]

I’ll share additional thoughts as I proceed. I only recently dove into the DSPy code base, so any feedback from the community on my understanding (or lack thereof) and guidance on research would be greatly appreciated!

darinkishore commented 9 months ago

@mattredlon

Spitballing, but take a look at this issue/PR combo. Would this be what you were thinking about for item 1? (Signatures would need the ability to use an “immutable” prompt “skeleton”.)

https://github.com/darinkishore/dspy/issues/68

https://github.com/darinkishore/dspy/pull/69

mattredlon commented 9 months ago

These are excellent, @darinkishore! I'm not yet fluent enough in the DSPy code base or its design principles to be confident in my evaluation of your issue/PR, so please be patient with me. However, there was one element where I wasn't sure whether it should be addressed in the Signature, in the Module (perhaps leveraging a new genetic/islands retriever), or in the Teleprompter/compiler. You included it as:

> **Specifications** — Sample Signatures: Create a method to sample k signatures based on certain logic, akin to DeepMind's priority-function sorting and selection.
>
> **Action Items** — Design and implement the sampling of signature variations and the prompt construction methodology.

Are you envisioning the Signature itself being responsible for sampling k "programs" (to use FunSearch vernacular) each hop? If so, would you envision the Module as the location for executing the Python generated via the latest Signature/"program" variant? If so, where in the process would we persist the latest "program" and what would the role of the Teleprompter/compiler be?

okhat commented 9 months ago

Hey @mattredlon,

> Teleprompter is where we would want to execute and score each variant and then persist the "program" along with its evaluation score. We would perhaps then implement our genetic algorithm / island strategy as a retriever to select the example "programs" to include from the module, but someone could tell me this belongs in teleprompter as well.
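
The genetic-algorithm / island idea can be prototyped independently of where it ends up living in DSPy. A rough sketch, with all names hypothetical: each island keeps its own scored pool of (score, program) pairs; sampling draws candidates from one island at a time to preserve diversity, and a periodic migration step copies the global champion into the weakest island so good ideas spread without collapsing everything into one pool.

```python
import random

def sample_examples(islands, rng, k=2):
    """Draw the k best 'programs' from one randomly chosen island.
    Sampling per-island (rather than globally) preserves diversity,
    which is the point of the island model."""
    island = rng.choice(islands)
    ranked = sorted(island, key=lambda pair: pair[0], reverse=True)
    return [program for _, program in ranked[:k]]

def migrate(islands):
    """Copy the global champion into the weakest island (the island
    whose best score is lowest)."""
    champion = max((pair for island in islands for pair in island),
                   key=lambda pair: pair[0])
    weakest = min(islands, key=lambda island: max(s for s, _ in island))
    weakest.append(champion)

# Demo: three islands of (score, program) pairs.
islands = [
    [(3, "prog_a"), (1, "prog_b")],
    [(9, "prog_c")],
    [(2, "prog_d"), (0, "prog_e")],
]
picked = sample_examples(islands, random.Random(0))
migrate(islands)  # champion (9, "prog_c") is copied into the weakest island
```

Whether this selection logic belongs in a retriever or in the optimizer itself is the open question from the comment above; the sketch is agnostic to that choice.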

With the big caveat that I have (i) read your post carefully but (ii) never read the FunSearch paper or any content about it yet, unfortunately, I think this all belongs in a new teleprompter.

We'll rename teleprompters to optimizers, so let me start using that term now. Basically, I can imagine a `dspy.optimize.FunSearch` optimizer. It will take `FunSearch(**general_config_and_hyperparams)` and will have `.compile(program, trainset)` as usual.
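
To make the proposed interface concrete, here is a plain-Python skeleton of what such an optimizer might look like. It is entirely hypothetical: `dspy.optimize.FunSearch` does not exist, and the DSPy-specific pieces (LM-driven proposal, module scoring) are stubbed with callables so the sketch is runnable.

```python
class FunSearch:
    """Hypothetical optimizer with the interface described above:
    configured up front, then compiled against a program and trainset."""

    def __init__(self, metric, propose, iterations=20, pool_size=5):
        self.metric = metric        # scores one candidate on one example
        self.propose = propose      # stand-in for the LLM-driven mutation
        self.iterations = iterations
        self.pool_size = pool_size

    def _score(self, candidate, trainset):
        # Average metric over the trainset, as a DSPy evaluator would.
        return sum(self.metric(candidate, ex) for ex in trainset) / len(trainset)

    def compile(self, program, trainset):
        pool = [(self._score(program, trainset), program)]
        for _ in range(self.iterations):
            _, parent = pool[0]                 # greedy: mutate the current best
            child = self.propose(parent)
            pool.append((self._score(child, trainset), child))
            pool.sort(key=lambda pair: pair[0], reverse=True)
            del pool[self.pool_size:]
        return pool[0][1]                       # best candidate found

# Demo: a "program" is just an instruction string; the metric rewards length
# (a stand-in for a real evaluation) and the proposer appends a token.
optimizer = FunSearch(
    metric=lambda prog, ex: float(len(prog)),
    propose=lambda prog: prog + " step",
)
compiled = optimizer.compile("solve it", trainset=[None, None])
```

In a real implementation, `propose` would be the LLM call that rewrites the program's instructions and `metric` would be a task-specific evaluator; the island sampling discussed earlier would replace the greedy `pool[0]` selection.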

In terms of the original design, Signatures in DSPy are stateless and immutable.* The optimized parts are held inside the modules, basically/mainly `dspy.Predict`. (Other modules generally use `dspy.Predict` in different ways.)

The main things that can be optimized are typically instructions and demonstrations (or the LM weights if we're finetuning). I assume FunSearch is more about the instructions?

*Footnote: Even though I think signatures should eventually be stateless and immutable in principle, we took some shortcuts in `SignatureOptimizer`, which does mutate the signature's instruction. Probably not the right long-term thing to do.

kushinm commented 6 months ago

Thanks for starting this thread, @mattredlon! I was curious whether you or others have made any progress on this front (perhaps building on the suggestions made by @okhat above)?