chandlj closed this issue 8 months ago.
Hi @chandlj,
Thanks for the question!
You can indeed specify a list-type format for the `dspy.OutputField` of the Signature. To generate a specified number of responses within the outputs, you can mention this in the instruction and/or introduce a `dspy.InputField` that takes in the user-expected number of outputs. For example:
```python
import dspy

class BasicQA(dspy.Signature):
    """Return a list of specified number of possible answers to the question."""

    question = dspy.InputField()
    number = dspy.InputField(desc="number of possible answers to return")
    answer = dspy.OutputField(format=list, desc="unique possible answers")


class TestModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.ChainOfThought(BasicQA)

    def forward(self, question, number):
        prediction = self.generate_answer(question=question, number=number)
        return dspy.Prediction(answer=prediction.answer)
```
From here, you can specify your inputs alongside the number of expected outputs and produce a list of responses.
Pro tip: since we are imposing this constraint on the LLM itself, it is subject to errors, generating more or fewer answers than the expected number (or even duplicates). This is where you can make use of `dspy.Suggest` and/or `dspy.Assert` to check that the list's length matches your `number` InputField, as in the sketch below.
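For instance, a rough sketch of that check, assuming the answers come back one per line (you'd also wrap the module with DSPy's assertion transform to activate the backtracking):

```python
import dspy
from dspy.primitives.assertions import assert_transform_module, backtrack_handler

class CheckedModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.ChainOfThought(BasicQA)

    def forward(self, question, number):
        prediction = self.generate_answer(question=question, number=number)
        # Assumption: the LM lists one answer per line.
        answers = [a.strip() for a in prediction.answer.split("\n") if a.strip()]
        # Soft constraint: on failure, DSPy backtracks and retries with this feedback.
        dspy.Suggest(
            len(set(answers)) == int(number),
            f"Return exactly {number} unique answers, one per line.",
        )
        return dspy.Prediction(answer=prediction.answer)

# Wrap the module so Suggest/Assert statements actually trigger retries.
checked = assert_transform_module(CheckedModule(), backtrack_handler)
```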
Let me know if this helps!
@arnavsinghvi11 Thanks for the help! I've noticed that this starts to break down once you add more fields that each depend on one another. For example:
```python
class BasicQA(dspy.Signature):
    """Return a list of specified number of possible answers to the question."""

    question = dspy.InputField()
    number = dspy.InputField(desc="number of possible answers to return")
    answer = dspy.OutputField(format=list, desc="unique possible answers")
    options = dspy.OutputField(
        format=list,
        desc="For each answer, a list of four possible options labeled A, B, C, and D.",
    )
```
Using it via `TestModule()(question="Who was a president of the United States?", number="5")` results in a proper list of answers but a malformed options list. As the "schema", so to speak, gets more complex, I've noticed that DSPy has a harder time enforcing format. I'm sure few-shot examples would improve this a bit, but this is something LangChain handles pretty cleanly out of the box. Basically, I'm looking for the optimizations and prompt-building that DSPy provides, combined with the stronger output parsing and type enforcement that something like Outlines or LangChain provides.
It could be helpful to allow "composite signatures", like the one below, for building more complex signatures and types. The implicit schema here is that the LLM would respond with a list of `BasicQA` signatures:
```python
class ListOfBasicQA(dspy.Signature):
    question = dspy.InputField(...)
    outputs = dspy.OutputField(format=list, base_signature=BasicQA)
```
Hey @chandlj,
This is a great point; I'm also interested in "the optimizations that DSPy provides with stronger output parsing and type-enforcing".
I think LangChain et al. provide this now through function-calling APIs? Those are quite restrictive in general; we'd rather use prompt -> completion interfaces. I think the best projects in this sphere are SGLang and Outlines. We've been wanting to allow DSPy signatures to have multiple backends, like SGLang or Outlines (or even function calling, for that matter).
That's something we need help with; let us know if you want to explore it.
@okhat Yes, I believe LangChain has the option of using OpenAI's function-calling API, but I don't know whether it uses it by default. In my experience it doesn't, because passing in a JSON formatter's parsing instructions can still return malformed JSON unless you force the model into JSON mode, which is obviously not a universal feature (although models are still pretty good at returning JSON even when you don't specify JSON mode explicitly).
I could help explore this. How do Signatures currently handle input/output parsing? My understanding is that Signatures get baked into the prompt, but outside of the prompt there isn't much enforcement? Let me know what your progress in this space has been.
Dear @okhat,
I found that DSPy has recently added Typed Predictors, which resolve this issue. However, when I tried this new feature with a more complex structure like lists, I encountered the error below. I was wondering if there is a solution for this.
```python
import dspy
from pydantic import BaseModel, Field

class Output(BaseModel):
    question: str = Field(...)
    answer: str = Field(...)

class ListModel(BaseModel):
    outputs: list[Output] = Field(...)

class ListGenerator(dspy.Signature):
    input: str = dspy.InputField()
    output: ListModel = dspy.OutputField()

predictor = dspy.TypedChainOfThought(ListGenerator)
prediction = predictor(input="Some String ...")
```
The error I get is as follows:

```text
ValueError: ('Too many retries trying to get the correct output format. Try simplifying the requirements.', {'output': 'ValueError("Don\'t write anything after the final json")'})
```
@kimianoorbakhsh I have encountered a similar issue when using Typed Predictors. What helped for me was switching to a more "capable" language model and simplifying the outputs. This could mean, for example, adding the description parameter in `Field(...)` to provide context, or removing output fields when there are too many.
However, this has been trial and error, and I would also be very interested to know if there is a different resolution!
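For instance, a rough sketch of the kind of simplification I mean (untested, just to illustrate): drop the intermediate `ListModel` wrapper and describe each field.

```python
import dspy
from pydantic import BaseModel, Field

class Output(BaseModel):
    # Note: pydantic's Field takes description=, while dspy's own fields take desc=.
    question: str = Field(..., description="the generated question")
    answer: str = Field(..., description="a short answer to the question")

class ListGenerator(dspy.Signature):
    input: str = dspy.InputField()
    # The list annotation replaces the intermediate ListModel wrapper.
    output: list[Output] = dspy.OutputField(desc="question/answer pairs")

predictor = dspy.TypedChainOfThought(ListGenerator)
```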
Does DSPy use JSON mode or function calling? Can DSPy automatically add multi-shot examples for the schema? Also, for open-weight models, an SGLang backend would solve this as well.
We are working on building a model that outputs a list of outputs per input context. In LangChain, we can use output parsers to enforce list-like formats; a sketch follows below. When you ask GPT-4, for example, to generate 5 outputs using the JSON formatting instructions, it will output a list of neat, JSON-formatted question-and-answer pairs.
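A sketch of the kind of parser setup I mean (field names here are illustrative), using LangChain's `PydanticOutputParser`:

```python
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class QA(BaseModel):
    question: str = Field(description="the question")
    answer: str = Field(description="the answer to the question")

class QAList(BaseModel):
    # Illustrative wrapper: a list of question/answer pairs.
    items: list[QA] = Field(description="question and answer pairs")

parser = PydanticOutputParser(pydantic_object=QAList)
# These JSON formatting instructions get embedded in the prompt.
print(parser.get_format_instructions())
# After the LLM responds: parser.parse(llm_output) -> QAList
```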
It's not clear how to do this in DSPy using Signatures. Is there a way to specify a Signature that is a list of another base Signature? Or can you specify a list format in a `dspy.OutputField`?
The other way I thought about doing it was using `dspy.Suggest`, as shown in the paper introducing `Suggest` and `Assert`. From my understanding, however, that approach calls the LLM 5 separate times with the same context, which is less performant than generating all 5 answers at once. Any help on this would be appreciated!