Extend generation and template.extract producing empty fields and malformed outputs.

mikeedjones commented 2 months ago

The extend generation logic in dsp/primitives/predict.py is producing the behaviour illustrated below.

The example uses an artificially low max_tokens to illustrate the bug with real LM calls.

import dspy

llm = dspy.OpenAI(model='gpt-3.5-turbo', max_tokens=10)
dspy.settings.configure(lm=llm)
class SciGen(dspy.Signature):
    """context and answer on science questions"""
    question = dspy.InputField(desc="Input question")
    sun_context = dspy.OutputField(desc="sun_context") 
    moon_context = dspy.OutputField(desc="moon_context") 
    earth_context = dspy.OutputField(desc="earth_context") 
    relative_distances_context = dspy.OutputField(desc="relative_distances_context") 
    answer = dspy.OutputField(desc="Answer only when you have all the context fields.")

context_generator = dspy.Predict(SciGen, n=1)
response = context_generator(question="Why do we have solar eclipse?")

response

# Prediction(
#     sun_context='During a solar eclipse, the Moon passes between the',
#     moon_context='', 
#     earth_context="Earth and the Sun, blocking the Sun's light",
#     relative_distances_context='',
#     answer='Answer: A solar eclipse occurs when the Moon'.
# )

Explanation

When the model does not generate all the correct field prefixes in the correct order, logic in dsp/primitives/predict.py is used to continue the generation until the signature is completed.

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/primitives/predict.py#L95

The "Last field"'s value is set to "" - presumably to throw away a partial completion?

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/primitives/predict.py#L98

And a further call to the LM is made with the updated prompt - with the "last field" as the trailing prefix - ie

"""
...
Moon Context: """

Whilst the prompt is rendered correctly,template.extract compares the field value to None to decide where the new completion from the LM is meant to go in the signature:

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/templates/template_v2.py#L156

Since the value of the last field is "" the new completion is instead attached to a field which is still None. This logic stands in contrast to how the prompt was built - where the field value is compared to "" to determine if it is "complete".

https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dsp/templates/template_v2.py#L81

The bug can be fixed by either

updating the extend generation logic to assign None to the partial field
updating the TemplateV2.extract logic to check for "" when assigning outputs
updating the TemplateV2.query logic to exclude fields with empty-string values as completed when determining which field to leave as the trailing prefix.

Speculation

I think the bug makes Dspy fail more quietly when the LM does not generate the fields as specified in the "Follow the following format" section of the prompt - as reported by https://github.com/stanfordnlp/dspy/pull/1169#issuecomment-2179472268 after a now reverted change to the extend generation logic (#920). The current logic means that at least one extra OutputField is added to the results for every recursion in generate - independent of what the LM produces. I think this means complex signatures will be reliably completed (up to the user-provided recursion depth) even if the LM is not actually generating the required field prefixes.

I think a "fix" to this bug might cause some examples to start to "fail" more loudly when the extend generation logic reaches its maximum recursion depth.

A more atomised fix to this bug would ensure that the last incomplete field is skipped correctly when the generation is extended, with appropriate documentation.

Cheers

mikeedjones commented 1 month ago

@okhat @arnavsinghvi11 have you been able to reproduce this issue at all? Would be great to get this fixed!

mikeedjones commented 1 month ago

Would be great to get a view on how best to fix this! Cheers :)

mikeedjones commented 3 weeks ago

Potentially https://github.com/stanfordnlp/dspy/issues/1312 could be linked?

stanfordnlp / dspy

Extend generation and template.extract producing empty fields and malformed outputs. #1232

Explanation

Speculation