stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License
16.32k stars 1.26k forks source link

Extend generation and template.extract producing empty fields and malformed outputs. #1232

Open mikeedjones opened 2 months ago

mikeedjones commented 2 months ago

The extend generation logic in dsp/primitives/predict.py is producing the behaviour illustrated below.

The example uses an artificially low max_tokens to illustrate the bug with real LM calls.

import dspy

llm = dspy.OpenAI(model='gpt-3.5-turbo', max_tokens=10)
dspy.settings.configure(lm=llm)
class SciGen(dspy.Signature):
    """context and answer on science questions"""
    question = dspy.InputField(desc="Input question")
    sun_context = dspy.OutputField(desc="sun_context") 
    moon_context = dspy.OutputField(desc="moon_context") 
    earth_context = dspy.OutputField(desc="earth_context") 
    relative_distances_context = dspy.OutputField(desc="relative_distances_context") 
    answer = dspy.OutputField(desc="Answer only when you have all the context fields.")

context_generator = dspy.Predict(SciGen, n=1)
response = context_generator(question="Why do we have solar eclipse?")

response

# Prediction(
#     sun_context='During a solar eclipse, the Moon passes between the',
#     moon_context='', 
#     earth_context="Earth and the Sun, blocking the Sun's light",
#     relative_distances_context='',
#     answer='Answer: A solar eclipse occurs when the Moon'.
# )

Explanation

When the model does not generate all the correct field prefixes in the correct order, logic in dsp/primitives/predict.py is used to continue the generation until the signature is completed.

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/primitives/predict.py#L95

The "Last field"'s value is set to "" - presumably to throw away a partial completion?

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/primitives/predict.py#L98

And a further call to the LM is made with the updated prompt - with the "last field" as the trailing prefix - ie

"""
...
Moon Context: """

Whilst the prompt is rendered correctly,template.extract compares the field value to None to decide where the new completion from the LM is meant to go in the signature:

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/templates/template_v2.py#L156

Since the value of the last field is "" the new completion is instead attached to a field which is still None. This logic stands in contrast to how the prompt was built - where the field value is compared to "" to determine if it is "complete".

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/templates/template_v2.py#L81

The bug can be fixed by either

Speculation

I think the bug makes Dspy fail more quietly when the LM does not generate the fields as specified in the "Follow the following format" section of the prompt - as reported by https://github.com/stanfordnlp/dspy/pull/1169#issuecomment-2179472268 after a now reverted change to the extend generation logic (#920). The current logic means that at least one extra OutputField is added to the results for every recursion in generate - independent of what the LM produces. I think this means complex signatures will be reliably completed (up to the user-provided recursion depth) even if the LM is not actually generating the required field prefixes.

I think a "fix" to this bug might cause some examples to start to "fail" more loudly when the extend generation logic reaches its maximum recursion depth.

A more atomised fix to this bug would ensure that the last incomplete field is skipped correctly when the generation is extended, with appropriate documentation.

Cheers

mikeedjones commented 1 month ago

@okhat @arnavsinghvi11 have you been able to reproduce this issue at all? Would be great to get this fixed!

mikeedjones commented 1 month ago

Would be great to get a view on how best to fix this! Cheers :)

mikeedjones commented 3 weeks ago

Potentially https://github.com/stanfordnlp/dspy/issues/1312 could be linked?