aidangomez opened 8 months ago
@aidangomez I like your distinction between the "preamble" and the "objective". Adding something like this would go a long way toward improving the output via more explicit instructions. I'm not sure if this would be added to `dspy.Signature`, a module like `dspy.ChainOfThought`, or somewhere else.
I think we should prioritize implementing this concept first. Then, we can dive into the proposition you made regarding structured outputs. There have been some discussions about how to add this functionality (https://github.com/stanfordnlp/dspy/issues/264). I'd like to see something similar to 'Outlines' where the "search space" is limited based on the defined pattern, but I know there are some logistical hurdles there.
A preamble or high-level context/goal/awareness feature would be nice.
Thanks @aidangomez , this makes sense to me. We've started a refactor of the internals #390 and I'll think about the right place to do this.
I confess I'm conceptually uncomfortable with having a global "DSPy instruction" that exists in all prompts but I do agree that it adds important context and may improve most models' responses, even before optimization/compiling.
I'll update here again this week.
Thanks all, @okhat could you say more about the discomfort?
If I had to guess, I'd assume the concern is about hurting DSPy's generality by providing an instruction that might limit DSPy's applicability?
I think that's a fair criticism, although I do think the gains from giving the model context about the framework it's operating within are important. It should definitely be optional; I guess the question is whether it's on by default or not.
A wholesale alternative to a preamble is that Cohere, OpenAI, Meta, etc. each need to incorporate data that looks like DSPy so our models can recognise they're operating in the DSPy context and return in the format expected by DSPy. This is brittle since upstream DSPy changes then need to be propagated backwards into new generations of models. So pretty much not a viable option.
Would it be a solution for DSPy to have more flexibility in templates? E.g. you could use a JSONTemplate that formats all inputs/outputs as JSON, a ClassicTemplate that uses the current DSPy `Name: data` format, and maybe a ToolsTemplate that tries to use tool syntax for everything. Then people can pick what works better for their use case or the LM they are using.
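To sketch the idea (none of this exists in DSPy; the rendering below is just an illustration of the two styles):

```python
# Hypothetical sketch only: neither of these renderers exists in DSPy; they just
# illustrate how the same signature fields could be rendered in different formats.
import json

def render_json(instructions: str, inputs: dict, output_fields: dict) -> str:
    """JSONTemplate idea: send inputs as JSON and ask for JSON keyed by the output fields."""
    return (
        f"{instructions}\n\n"
        f"Input:\n{json.dumps(inputs, indent=2)}\n\n"
        f"Respond with a JSON object containing these keys:\n{json.dumps(output_fields, indent=2)}"
    )

def render_classic(instructions: str, inputs: dict, output_fields: dict) -> str:
    """ClassicTemplate idea: the current 'Name: data' style."""
    lines = [instructions, "", "Follow the following format.", ""]
    lines += [f"{k.capitalize()}: {v}" for k, v in inputs.items()]
    lines += [f"{k.capitalize()}:" for k in output_fields]
    return "\n".join(lines)

prompt = render_json(
    "Answer questions with short factoid answers.",
    inputs={"question": "What is the capital of Australia?"},
    output_fields={"answer": "often between 1 and 5 words"},
)
```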
The idea was always to have Adapters — lightweight exchangeable translators between signatures (filled with optimized instructions and/or demos) and final LM calls.
https://github.com/stanfordnlp/dspy/pull/138
We can revisit this in the current refactor — particularly interested in @CyrusOfEden ’s thoughts on this for #424
If we have Adapters, we can have a CohereAdapter that is designed to work well at mapping optimized DSPy parts with whatever frozen decisions are good for Cohere.
Same for any special backend — openai function calling, outlines, etc
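As a rough sketch (the `Adapter` interface below is an assumption for illustration, not the API from #138 or the refactor), a backend-specific adapter could look like:

```python
# Hypothetical sketch: this Adapter interface is assumed; the real one is whatever
# #138 / the refactor lands on. The point is that formatting and parsing decisions
# live in a per-backend class rather than in the signature.
class Adapter:
    def format(self, instructions: str, demos: list[dict], inputs: dict) -> str:
        raise NotImplementedError

    def parse(self, completion: str) -> dict:
        raise NotImplementedError

class XMLAdapter(Adapter):
    """Example backend flavor: wrap each field in XML tags."""
    def format(self, instructions, demos, inputs):
        # A real adapter would render the few-shot demos here as well.
        body = "\n".join(f"<{k}>{v}</{k}>" for k, v in inputs.items())
        return f"<instructions>{instructions}</instructions>\n{body}"

    def parse(self, completion):
        # A real implementation would pull <field>...</field> spans out of the completion.
        raise NotImplementedError
```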
Any additional thoughts or plans on an Adapter strategy that allows model-specific prompting? Claude, for example, uses XML as part of its notation to draw attention to and label key elements of the prompt.
@canada4663 it's coming in the backend-refactor branch — we're exposing a `prepare_request` that lets you override the prompts, messages, and params as you see fit.
Our TODOs before merging are roughly:
- [ ] Merge main into backend-refactor
- [ ] Create deprecation notices for the existing dsp.X modules
- [ ] Bump the DSPy version
- [ ] Merge into main
Very cool, will check out that branch
I was thinking about exactly this topic last week, going in circles around how DSPy could be optimized for syntax like Claude's XML (or any other future adaptations for Mistral, Cohere, etc.).
One idea that could be worth exploring would be to use the prompt libraries from the model creators to build a best-practices dataset, then ground the adapter on those examples.
This is one of the existing libraries for the Claude models: https://docs.anthropic.com/claude/prompt-library
@okhat I think it would be great if some priority could be placed on this issue, since hard-coded prompts would greatly contradict DSPy's design philosophy of declarative signatures that don't depend on the features of any particular LM (which I interpreted mostly from this HN post).
It appears that when using the `Predict` module, the LM completions come from either
https://github.com/stanfordnlp/dspy/blob/d8b8909773fc31e72cec093db2f26109590e524e/dspy/predict/predict.py#L137-L140
or
https://github.com/stanfordnlp/dspy/blob/d8b8909773fc31e72cec093db2f26109590e524e/dspy/predict/predict.py#L162-L166
depending on whether experimental features are enabled in the settings.
However, while `signature_to_template()` has an argument for a custom adapter, there appears to be no way to either pass in a custom adapter or specify one in the global settings. Thus, simply allowing a custom adapter to be specified for `Predict` would be sufficient for providing a custom prompt format.
I think using something like a Jinja template, as brought up in #996, might be a neater way to specify an adapter than in pure Python code.
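For illustration (this isn't a DSPy feature, and the template text and field names are made up), the idea would be something like:

```python
# Sketch of the Jinja idea from #996: express the prompt layout as a template
# instead of Python string-building. The template text here is illustrative only.
from jinja2 import Template

PROMPT = Template(
    "{{ instructions }}\n\n"
    "Follow the following format.\n\n"
    "{% for name, desc in fields.items() %}{{ name }}: {{ desc }}\n{% endfor %}\n"
    "{% for name, value in inputs.items() %}{{ name }}: {{ value }}\n{% endfor %}"
)

prompt = PROMPT.render(
    instructions="Answer questions with short factoid answers.",
    fields={"Question": "${question}", "Answer": "often between 1 and 5 words"},
    inputs={"Question": "What is the capital of Australia?"},
)
```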
For now, a few workarounds:
- To change the overall prompt format, `signature_to_template()` or `Template.__call__()` needs to be overridden.
- To change the "Follow the following format." guideline and the list of fields, `Template.guidelines()` needs to be overridden.
- To prompt `ChainOfThought` more clearly (the default at least doesn't work that well for me in Llama 3 8B Instruct), pass `rationale_type` to `ChainOfThought.__init__()`. #809
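That last one looks roughly like this (the custom prefix/desc strings are just examples):

```python
# Customizing the rationale field that ChainOfThought adds (see #809).
# The prefix/desc strings are only examples; tune them for your model.
import dspy

rationale = dspy.OutputField(
    prefix="Reasoning: walk through the problem step by step before answering.",
    desc="a brief chain of reasoning that leads to the answer",
)

qa = dspy.ChainOfThought("question -> answer", rationale_type=rationale)
```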
I get the sense that DSPy isn't telling the model enough about itself in order to properly react to the prompts being given.
As a simple example, let's take the quick-start tutorial, where you construct a simple QA program that uses chain of thought:
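Roughly (the LM setup and the exact question here are illustrative):

```python
# Roughly the quick-start program in question; the LM setup and question are illustrative.
import dspy

lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

qa = dspy.ChainOfThought(BasicQA)
prediction = qa(question="Who is the current prime minister of Canada?")
print(prediction.answer)
```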
The resulting prompt is:
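Approximately (reconstructed from DSPy's classic field format, so treat the exact wording and question as illustrative):

```
Answer questions with short factoid answers.

---

Follow the following format.

Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: often between 1 and 5 words

---

Question: Who is the current prime minister of Canada?
Reasoning: Let's think step by step in order to
```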
Which results in an answer of `Stephen Harper` from GPT-3.5 and Cohere (both of which are wrong).

If you (as a human) were handed this, it's not entirely clear what the submitter is trying to get you to do.
One reasonable reading of the prompt would be:
I think performance and clarity could be improved by providing a bit more guidance and structure to the model via a preamble or some other similar strategy.
Here is an example of how we could adjust the prompt to give it more structure:
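Something along these lines (the preamble wording and JSON schema are illustrative, not a fixed proposal):

```
You are one step inside a DSPy program. You will receive a task description, a
required output format, and an input. Respond with a single JSON object that
matches the output format exactly, and nothing else.

Task: Answer questions with short factoid answers.

Output format:
{"reasoning": "<think step by step>", "answer": "<1 to 5 words>"}

Input:
{"question": "Who is the current prime minister of Canada?"}
```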
Here the response from Cohere is
Which is correct! Correctness doesn't really matter as much as the fact that the prompt and the response are both now much more explicit. I think relying on JSON is probably the best bet, given that structured responses are going to become standard for both closed and open-source model providers.
Let me know your thoughts on the above and whether what I'm describing makes sense or is unclear.