Open thomasahle opened 5 months ago
Yeah :/ Good point. Same for ChainOfThought.
I wonder if we can actually just resolve this by making a shallow wrapper and renaming the current thing to CorePredict and CoreChainOfThought?
This is less of an engineering thing and more a programming language theory thing, but I've been thinking about what category Predict, ChainOfThought, etc. fall under. I think there may be a missing category in the metamodel, which I've been calling a 'strategy' in my head: a module that returns a module. Conceptually, this opens the door to strategy optimisation (optimising the module that returns the module separately from the final signature), but the main benefit for me is just allowing us to reason about higher-order functions (important with functorial things like lists). I can imagine strategies for handling mapping on lists, a tree of thought one, a graph of thought one, or even ones that add MemGPT/Self-RAG support to another strategy.
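To make the "module that returns a module" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `Module`, `Upper`, `MapOverList` and `map_strategy` are hypothetical stand-ins, not DSPy classes, and the "LM call" is replaced by a trivial string operation.

```python
from typing import Callable, List


class Module:
    """Hypothetical stand-in for a DSPy-style module (not the real dspy.Module)."""

    def __call__(self, **kwargs):
        return self.forward(**kwargs)

    def forward(self, **kwargs):
        raise NotImplementedError


class Upper(Module):
    """Toy leaf module: upper-cases a single text input."""

    def forward(self, text: str) -> str:
        return text.upper()


class MapOverList(Module):
    """The module a list-mapping strategy returns: applies `inner` to each item."""

    def __init__(self, inner: Module):
        self.inner = inner

    def forward(self, texts: List[str]) -> List[str]:
        return [self.inner(text=t) for t in texts]


# A "strategy" in the sense described above: a callable from module to module.
Strategy = Callable[[Module], Module]


def map_strategy(inner: Module) -> Module:
    """Hypothetical strategy that lifts a single-item module to a list module."""
    return MapOverList(inner)


mapped = map_strategy(Upper())
result = mapped(texts=["a", "b"])
```

Because the strategy's output is itself a `Module`, strategies of this shape can be stacked, which is the higher-order behaviour the comment is after.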
As I understand DSPy, a "program" is a "module" composed of other modules, such as "Predictors" (ChainOfThought/Predict/ReAct...), "Retrievers" or other "Subprograms". But if we are going to categorize "Predictors" differently, I think I would call them prompting "Techniques".
I think it's good to just try and follow PyTorch on this. There, an nn.Sequential is still an nn.Module even though it takes a list of modules.
Maybe the current Predict code could be moved to a function that the Predict module calls? A bit like your CorePredict idea @okhat
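A rough sketch of that split, assuming nothing about the real DSPy internals: the prediction logic lives in a free function, and the module is a shallow wrapper that only stores configuration and delegates. `core_predict`, `echo_lm` and the classes below are hypothetical names, and the "LM" is a plain callable.

```python
class Module:
    """Hypothetical stand-in base class (not the real dspy.Module)."""

    def __call__(self, **kwargs):
        return self.forward(**kwargs)


def core_predict(signature: str, lm, **inputs) -> dict:
    """Hypothetical free function that holds the actual prediction logic."""
    body = "\n".join(f"{key}: {value}" for key, value in sorted(inputs.items()))
    prompt = signature + "\n" + body
    return {"completion": lm(prompt)}


class Predict(Module):
    """Shallow wrapper: keeps the signature and LM, delegates the work."""

    def __init__(self, signature: str, lm):
        self.signature = signature
        self.lm = lm

    def forward(self, **kwargs):
        return core_predict(self.signature, self.lm, **kwargs)


def echo_lm(prompt: str) -> str:
    """Toy language model used only for this illustration."""
    return f"echo({len(prompt)} chars)"


predict = Predict("question -> answer", echo_lm)
out = predict(question="What is 2 + 2?")
```

With this layout the wrapper stays an ordinary `Module`, in line with the PyTorch analogy, while the logic can be reused or tested without instantiating any class.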
Regarding ChainOfThought, it seems like we could just replace it with
class ChainOfThought(Module):
    def __init__(self, signature, rationale_type=None, **config):
        super().__init__(**config)
        signature = ensure_signature(signature)
        *_keys, last_key = signature.output_fields.keys()
        rationale_type = rationale_type or dspy.OutputField(
            prefix="Reasoning: Let's think step by step in order to",
            desc="${produce the " + last_key + "}. We ...",
        )
        self.extended_signature = signature.prepend("rationale", rationale_type, type_=str)
        self.predict = dspy.Predict(self.extended_signature)

    def forward(self, **kwargs):
        return self.predict(**kwargs)
This still passes all my tests, except those for the (bayesian) signature optimizer, which has some hacks regarding extended_signatures.
Or I guess a CorePredictor would be nice, as you say, since it serves as a place to "store signatures", so they can be changed, while keeping the Signature class itself immutable. E.g. in the Signature optimizer:
# Go through our module's predictors
for p_i, (p_old, p_new) in enumerate(zip(module.predictors(), module_clone.predictors())):
    candidates_ = latest_candidates[id(p_old)] # Use the most recently generated candidates for evaluation
    if len(module.predictors()) > 1:
        candidates_ = all_candidates[id(p_old)] # Unless our program has multiple predictors, in which case we need to reevaluate all prompts with the new prompt(s) for the other predictor(s)
    # For each candidate
    for c_i, c in enumerate(candidates_):
        # Get the candidate instruction and prefix
        instruction, prefix = c.proposed_instruction.strip('"').strip(), c.proposed_prefix_for_output_field.strip('"').strip()
        # Set this new module with our instruction / prefix
        if (hasattr(p_new, 'extended_signature')):
            *_, last_key = p_new.extended_signature.fields.keys()
            p_new.extended_signature = p_new.extended_signature \
                .with_instructions(instruction) \
                .with_updated_fields(last_key, prefix=prefix)
        else:
            *_, last_key = p_new.extended_signature1.fields.keys()
            p_new.extended_signature1 = p_new.extended_signature1 \
                .with_instructions(instruction) \
                .with_updated_fields(last_key, prefix=prefix)
            *_, last_key = p_new.extended_signature2.fields.keys()
            p_new.extended_signature2 = p_new.extended_signature2 \
                .with_instructions(instruction) \
                .with_updated_fields(last_key, prefix=prefix)
If we refactor this, we should be sure to find a way to avoid the two cases of extended_signature vs extended_signature1 and extended_signature2.
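One possible way to collapse the two cases, sketched under heavy assumptions: loop over the attribute names instead of branching on them. `FakeSignature` is a hypothetical immutable stand-in that only mimics the `with_instructions` / `with_updated_fields` calls used above (its `fields` is a tuple of pairs, not the real mapping), and `apply_candidate` is an illustrative helper, not existing DSPy code.

```python
from dataclasses import dataclass, replace
from types import SimpleNamespace


@dataclass(frozen=True)
class FakeSignature:
    """Hypothetical immutable signature: instructions plus (name, prefix) pairs."""

    instructions: str
    fields: tuple

    def with_instructions(self, instructions):
        return replace(self, instructions=instructions)

    def with_updated_fields(self, name, prefix):
        fields = tuple((n, prefix if n == name else p) for n, p in self.fields)
        return replace(self, fields=fields)


# Attribute names taken from the optimizer snippet above.
SIGNATURE_ATTRS = ("extended_signature", "extended_signature1", "extended_signature2")


def apply_candidate(predictor, instruction, prefix):
    """Update every extended signature the predictor has; return the names touched."""
    updated = []
    for attr in SIGNATURE_ATTRS:
        sig = getattr(predictor, attr, None)
        if sig is None:
            continue
        last_key = sig.fields[-1][0]  # name of the last field
        new_sig = sig.with_instructions(instruction).with_updated_fields(last_key, prefix=prefix)
        setattr(predictor, attr, new_sig)
        updated.append(attr)
    return updated


base = FakeSignature("old", (("question", "Question:"), ("answer", "Answer:")))
single = SimpleNamespace(extended_signature=base)
double = SimpleNamespace(extended_signature1=base, extended_signature2=base)
single_updated = apply_candidate(single, "Be brief.", "Final answer:")
double_updated = apply_candidate(double, "Be brief.", "Final answer:")
```

Note one behavioural difference from the snippet above: this loop updates whichever of the three attributes exist, whereas the original only touches `extended_signature1` and `extended_signature2` when `extended_signature` is absent. A cleaner refactor would presumably remove the numbered attributes altogether.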
@thomasahle I think the CorePredict will have self.instructions and self.demos, instead of any kind of changes to self.signature. Once a module is created (including CorePredict) the signature will never be changed --- that's my current thinking at least, I hope it's possible to realize in practice.
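As a sketch of what that could look like (an assumption about the design, not the actual implementation): the signature is frozen and only read, while the optimizable state sits on the predictor as `instructions` and `demos`. The `Signature` dataclass, `CorePredict` class and `render_prompt` method below are hypothetical and far simpler than the real DSPy classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Hypothetical immutable signature (not the real dspy.Signature)."""

    instructions: str
    inputs: tuple
    outputs: tuple


class CorePredict:
    """Hypothetical predictor: mutable instructions and demos, fixed signature."""

    def __init__(self, signature: Signature):
        self._signature = signature
        self.instructions = signature.instructions  # optimizers may overwrite this
        self.demos = []  # optimizers may append demonstrations here

    @property
    def signature(self) -> Signature:
        # Read-only: there is no setter, so the signature cannot be swapped out.
        return self._signature

    def render_prompt(self, **inputs) -> str:
        names = self.signature.inputs + self.signature.outputs
        lines = [self.instructions]
        for demo in self.demos:
            lines.append(" | ".join(f"{k}={demo[k]}" for k in names))
        lines.append(" | ".join(f"{k}={inputs[k]}" for k in self.signature.inputs))
        return "\n".join(lines)


sig = Signature("Answer the question.", ("question",), ("answer",))
predictor = CorePredict(sig)
predictor.instructions = "Answer in one word."
predictor.demos.append({"question": "Capital of France?", "answer": "Paris"})
prompt = predictor.render_prompt(question="Capital of Italy?")
```

This sketch covers only instructions and demos; it does not say where changed field prefixes or descriptions would live.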
Doesn't signature-optimizer also change the field descriptions and prefixes though?
For most purposes dspy.Predict behaves the same way as a dspy.Module. But if you try to pass a Predict directly to an optimizer, you'll notice that it's lacking a lot of (simple) methods that Module has.
Writing unit tests, I've often found myself writing unnecessary classes like