stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License

ChainOfThought("question -> answer", n=5) produces only 1 answer. #550

Open JPonsa opened 4 months ago

JPonsa commented 4 months ago

I am running the example shown in https://dspy-docs.vercel.app/docs/building-blocks/modules for ChainOfThought. Instead of getting 5 answers, I got only one answer.

Moreover, when running in a Jupyter notebook, rerunning the same cell produces the same outcome. I need to restart the kernel to get a different response.

  import dspy

  lm = dspy.OpenAI(
      api_base="http://localhost:11434/v1/",
      api_key="ollama",
      model="mistral",
      stop="\n\n",
      model_type="chat",
  )
  dspy.settings.configure(lm=lm, temperature=0.7)

  question = "What's something great about the ColBERT retrieval model?"

  # 1) Declare with a signature, and pass some config.
  classify = dspy.ChainOfThought("question -> answer", n=5)

  # 2) Call with input argument.
  response = classify(question=question)

  # 3) Access the outputs.
  response.completions.answer  # len(response.completions.answer) -> 1, expected 5

Output: ['The ColBERT retrieval model is great because it combines the power of BERT, a pre-trained language model, with information retrieval techniques. This allows it to understand the context and meaning behind queries, and retrieve relevant documents more effectively compared to traditional retrieval models.']
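
One possible explanation for the identical result on every rerun of the cell is that identical requests are being served from a cache. The thread does not confirm this, so treat it as an assumption. The sketch below is a toy memoized function, not DSPy's code; `cached_generate` and its `nonce` parameter are hypothetical names used only to show how a cache keyed on the call arguments behaves.

```python
import functools
import itertools

# Counter that advances only when the function body really runs (a cache miss).
_counter = itertools.count()


@functools.lru_cache(maxsize=None)
def cached_generate(prompt: str, temperature: float, nonce: int = 0) -> str:
    """Toy stand-in for an LM call (hypothetical, not a DSPy function)."""
    return f"sample #{next(_counter)} for {prompt!r}"


prompt = "What's something great about the ColBERT retrieval model?"

first = cached_generate(prompt, 0.7)            # cache miss: body runs
second = cached_generate(prompt, 0.7)           # same arguments: cache hit
third = cached_generate(prompt, 0.7, nonce=1)   # different arguments: body runs again

print(first == second)  # True
print(first == third)   # False
```

If caching is the cause, changing any argument that forms part of the cache key should produce a fresh response without restarting the kernel.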

JPonsa commented 4 months ago

The issue seems to be related to Ollama and mistral and how the stop sequence is handled, which affects n>1 processing. I get different behaviours using OllamaLocal(model="mistral") or dspy.OpenAI(api_key="ollama", model="mistral"). However, only gpt-3.5 seems to work for ChainOfThought(n>1).

  import dspy
  from pydantic import BaseModel, Field
  from dspy.teleprompt import LabeledFewShot
  from dspy.functional import TypedPredictor

  class SyntheticFactBaseModel(BaseModel):
      fact: str = Field(..., description="a statement")
      veracity: bool = Field(..., description="is the statement true or false")

  class ExampleSignature(dspy.Signature):
      """Generate an example of a synthetic fact. Be creative"""

      fact: SyntheticFactBaseModel = dspy.OutputField()

  # lm = dspy.OpenAI(
  #     api_base="http://localhost:11434/v1/",
  #     api_key="ollama",
  #     model="mistral",
  #     stop="\n\n",
  #     model_type="chat",
  # )

  # lm = dspy.OpenAI(model="gpt-3.5-turbo", stop="\n\n")

  lm = dspy.OllamaLocal(model="mistral", stop=["\n\n"])
  dspy.settings.configure(lm=lm, temperature=0.7)

  generator = TypedPredictor(ExampleSignature)
  examples = generator(config=dict(n=5)).completions.fact

  existing_examples = [
      dspy.Example(fact="The ski is green", veracity=False),
      dspy.Example(fact="Typed Dspy is cool!", veracity=True),
  ]

  trained = LabeledFewShot().compile(student=generator, trainset=existing_examples)
  augmented_examples = trained(config=dict(n=5)).completions.fact

-------- GPT-3.5 turbo output --------

  ['{"fact": "Synthetic facts must be in a JSON format", "veracity": true}',
   '{"fact": "Synthetic facts should always be wrapped in a JSON object", "veracity": true}',
   '{"fact": "Synthetic facts should always be presented in a JSON object format", "veracity": true}',
   '{"fact": "Synthetic facts should always be formatted as a JSON object", "veracity": true}',
   '{"fact": "Synthetic facts should always be in the format of a JSON object", "veracity": true}']

-------- dspy.OpenAI(api_key="ollama", model="mistral") output --------

     ['{\n"fact": "Artificial intelligence can create original compositions of music based on predefined parameters and styles.",\n"veracity": true\n}']

-------- OllamaLocal(model="mistral") output --------

  ['{\n"fact": "In the future, advanced AI systems will be able to generate creative and complex synthetic facts in a JSON format without human intervention.",\n"veracity": true\n}',
   '{\n"fact": "The mythical creature known as the \'Chupacabra\' is a cryptid that is said to inhabit parts of the Americas, particularly Puerto Rico and Mexico. It is described as having reddish-purple skin, large black eyes, and fangs or spines along its back.",\n"veracity": false\n}',
   '{\n"fact": "In the future, all JSON objects will have consistent formatting and begin and end with \'{\' and \'}\' respectively.",\n"veracity": true\n}',
   '{\n"fact": "In the future, all JSON objects must be properly formatted and enclosed within curly braces {}",\n"veracity": true\n}',
   '{\n"fact": "Artificial intelligence systems can\'t create or experience emotions, despite generating creative outputs.",\n"veracity": true\n}']
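
The outputs above differ mainly in how many completions come back for n=5. A minimal workaround sketch, assuming the backend may return fewer completions than requested: keep asking for the shortfall until n are collected. `collect_completions` and `single_answer_backend` are hypothetical helpers written for illustration, not DSPy or Ollama APIs, and the fake backend simply imitates a server that returns one completion per request.

```python
import itertools
from typing import Callable, List


def collect_completions(
    request: Callable[[int], List[str]], n: int, max_calls: int = 20
) -> List[str]:
    """Request n completions; if fewer come back, re-request the shortfall.

    `request` is any callable that takes the number of completions wanted
    and returns a list of strings (possibly shorter than asked for).
    """
    results: List[str] = []
    calls = 0
    while len(results) < n and calls < max_calls:
        batch = request(n - len(results))
        calls += 1
        if not batch:  # backend returned nothing: stop rather than loop forever
            break
        results.extend(batch)
    return results[:n]


# Fake backend (assumption): ignores the requested count and returns one answer.
_ids = itertools.count(1)


def single_answer_backend(n: int) -> List[str]:
    return [f"answer {next(_ids)}"]


answers = collect_completions(single_answer_backend, n=5)
print(len(answers))  # 5
```

With a real client, repeated identical requests may return the same text if responses are cached or sampling is deterministic, so the collected completions are not guaranteed to be distinct.
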
chris-cortner commented 3 months ago

I'm also seeing this against vLLM with TheBloke/Nous-Hermes-2-Mixtral-8x7B-DPO-AWQ.

RamXX commented 3 months ago

I reported a somewhat related issue with ChainOfThoughtWithHint(), but I do get the proper output with ChainOfThought('question -> answer', n=3) using:

  lm = dspy.OllamaLocal(model="openhermes:7b-mistral-v2.5-fp16", max_tokens=4000, timeout_s=240)
  dspy.settings.configure(lm=lm)

mikeedjones commented 2 months ago

I believe this is associated with https://github.com/stanfordnlp/dspy/pull/918.

RamXX commented 2 months ago

I recommend you re-test with the latest version of Ollama. The issue I opened initially is now closed. Please see this: https://github.com/stanfordnlp/dspy/issues/749#issuecomment-2094875038