stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License
17.45k stars, 1.33k forks

ReAct module continues moving forward despite getting an answer #748

Closed. milosgajdos closed this issue 5 months ago

milosgajdos commented 6 months ago

I've grabbed the sample code in the docs for ReAct and tried running it.

I discovered 2 things:

Here's the code:

import dspy

def main():
    # setup ollama client to interact with llama2
    llama2 = dspy.OllamaLocal(model="llama2")
    dspy.configure(lm=llama2)

    wiki_abstracts = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")
    dspy.configure(rm=wiki_abstracts)

    # Define a simple signature for basic question answering
    class BasicQA(dspy.Signature):
        """Answer questions with short factoid answers."""

        question = dspy.InputField()
        answer = dspy.OutputField(desc="often between 1 and 5 words")

    # Pass signature to ReAct module
    react_module = dspy.ReAct(BasicQA)

    # Call the ReAct module on a particular input
    question = "Aside from the Apple Remote, what other devices can control the program Apple Remote was originally designed to interact with?"
    result = react_module(question=question)

    print(f"Question: {question}")
    print(f"Final Predicted Answer (after ReAct process): {result.answer}")

if __name__ == "__main__":
    main()

Here's the snippet of agent prompting history:

Prediction(
    Thought_1="Question: Aside from the Apple Remote, what other devices can control the program Apple Remote was originally designed to interact with?\n\nThought 1: Hmm, that's an interesting question. Let me think about it... (pauses) Based on my knowledge of Apple's history and product offerings, I believe the Apple Remote was originally designed to work with Apple's line of personal computers, such as the Macintosh computer. However, I wonder if there are any other devices that can control the program Apple Remote was designed for... (continues thinking)",
    Action_1='Search[query="other devices that can control Apple Remote"]'
)
=== ACTION: Search[query="other devices that can control Apple Remote"]
Prediction(
    Thought_2='Question: Aside from the Apple Remote, what other devices can control the program Apple Remote was originally designed to interact with?\n\nThought 1: Hmm, that\'s an interesting question. Let me think about it... (pauses) Based on my knowledge of Apple\'s history and product offerings, I believe the Apple Remote was originally designed to work with Apple\'s line of personal computers, such as the Macintosh computer. However, I wonder if there are any other devices that can control the program Apple Remote was designed for... (continues thinking)\n\nAction 1: Search[query="other devices that can control Apple Remote"]\n\nObservation 1',
    Action_2='Answer: There are several other devices that can control the program Apple Remote was originally designed to interact with, including:\n\n1. iTunes Remote: This is a software application developed by Apple for iOS devices that allows for remote control of Apple TV or iTunes library in an area with Wi-Fi connectivity using the proprietary Digital Audio Control Protocol (DACP).\n2. Apple Remote Desktop: This is a Macintosh application produced by Apple, first released on March 14, 2002, that allows users to remotely control or monitor other computers over a network.\n3. Siri Remote: This is the remote control device included with the fourth generation'
)
=== ACTION: Answer: There are several other devices that can control the program Apple Remote was originally designed to interact with, including:

As you can see, the Action has reached an Answer, but the code ignores it and keeps splitting the output and iterating. Suggestion: check whether the agent has already returned an answer before max_iters is reached.

arnavsinghvi11 commented 6 months ago

Hi @milosgajdos, at the moment the ReAct module is configured to halt only at the stopping condition Finish[answer].

It seems the model is hallucinating a bit: the Thought_2 output ends with \n\nAction 1: Search[query="other devices that can control Apple Remote"]\n\nObservation 1, which can cause the subsequent steps to break as well. I suspect this is due to incompatibilities between DSPy and chat models (a fix is in progress), not the ReAct setup itself. For now, you can add another tool that checks whether the Action has reached an Answer and modify the stopping condition accordingly.

lmk if this helps or if you have any other suggestions! (ReAct is fairly experimental right now so we'd love any PRs to resolve issues like these!)
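A minimal sketch of the workaround suggested above, in plain Python rather than DSPy internals. The names extract_final_answer and run_loop are hypothetical and not part of DSPy; the sketch assumes each step yields an Action string like the ones in the log, and treats both Finish[...] and a leading "Answer:" as terminal.

```python
import re

# Hypothetical helpers, not part of DSPy: decide whether an Action string is terminal.
FINISH_RE = re.compile(r"^\s*Finish\[(?P<answer>.*)\]\s*$", re.DOTALL)
ANSWER_RE = re.compile(r"^\s*Answer:\s*(?P<answer>.+)", re.DOTALL)


def extract_final_answer(action: str):
    """Return the answer text if the action is terminal, else None."""
    for pattern in (FINISH_RE, ANSWER_RE):
        match = pattern.match(action)
        if match:
            return match.group("answer").strip()
    return None


def run_loop(step, max_iters=5):
    """Call step(hop) until it yields a terminal action or max_iters is hit.

    `step` stands in for one Thought/Action round and returns the Action string.
    Returns (answer, hops_used); answer is None if no terminal action was seen.
    """
    for hop in range(1, max_iters + 1):
        answer = extract_final_answer(step(hop))
        if answer is not None:
            return answer, hop
    return None, max_iters
```

With this check, an Action that starts with "Answer:" (as in Action_2 above) would end the loop on that hop instead of running on to max_iters.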

milosgajdos commented 5 months ago

Ah, my GH notifications got outta control. Thanks for the response!!

Interesting... I'm wondering, would the instruction optimiser be able to steer the model in the right direction? Or some DSPy assertion or suggestion? Sorry about the dumb questions, I'm still very new to this framework.

arnavsinghvi11 commented 5 months ago

Assertions (to validate outputs by checking for keywords and/or maintaining correct formatting), few-shot optimizers (bootstrapping unlabeled data to create agentic examples to learn from), instruction optimizers (better agent-based instructions), etc. are all great ways to improve agentic behavior!
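As a rough illustration of the kind of formatting check an assertion could run, here is a plain-Python sketch. The function check_step and both patterns are hypothetical and not DSPy's API; the tool names Search and Finish are taken from the thread above, and the check assumes a Thought should not repeat earlier "Thought N" / "Action N" / "Observation N" labels.

```python
import re

# Hypothetical checks, not DSPy's API: flag formatting problems in one ReAct step.
ACTION_RE = re.compile(r"^\s*(Search|Finish)\[.+\]\s*$", re.DOTALL)
LEAKED_LABEL_RE = re.compile(r"\b(Thought|Action|Observation) \d+\b")


def check_step(thought: str, action: str):
    """Return feedback messages; an empty list means the step looks well-formed."""
    problems = []
    if LEAKED_LABEL_RE.search(thought):
        problems.append(
            "Thought should not repeat earlier Thought/Action/Observation labels."
        )
    if not ACTION_RE.match(action):
        problems.append("Action must be Search[query] or Finish[answer].")
    return problems
```

Messages like these could be fed back to the model as retry feedback; how that is wired into a DSPy program depends on the version in use.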