stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License

DSPy COPRO Tutorial for Hotpot QA doesn't work with VertexAI Gemini #1187

Closed marshmellow77 closed 3 months ago

marshmellow77 commented 3 months ago

The official COPRO signature optimizer tutorial does not work with Gemini.

Here is the code:

import dspy
from dspy.datasets import HotPotQA
from dspy.evaluate import Evaluate
from dspy.teleprompt import COPRO

dataset = HotPotQA(
    train_seed=1, train_size=20, eval_seed=2023, dev_size=50, test_size=0
)

trainset, devset = dataset.train, dataset.dev
print(trainset)
print(devset)
gemini = dspy.GoogleVertexAI(model="gemini-1.5-flash")
dspy.configure(lm=gemini)

class CoTSignature(dspy.Signature):
    """Answer the question and give the reasoning for the same."""

    question = dspy.InputField(desc="question about something")
    answer = dspy.OutputField(desc="often between 1 and 5 words")

class CoTPipeline(dspy.Module):
    def __init__(self):
        super().__init__()

        self.signature = CoTSignature
        self.predictor = dspy.ChainOfThought(self.signature)

    def forward(self, question):
        result = self.predictor(question=question)
        return dspy.Prediction(
            answer=result.answer,
            reasoning=result.rationale,
        )

def validate_context_and_answer(example, pred, trace=None):
    answer_EM = dspy.evaluate.answer_exact_match(example, pred)
    return answer_EM

NUM_THREADS = 5
evaluate = Evaluate(
    devset=devset,
    metric=validate_context_and_answer,
    num_threads=NUM_THREADS,
    display_progress=True,
    display_table=False,
)
cot_baseline = CoTPipeline()

devset_with_input = [
    dspy.Example({"question": r["question"], "answer": r["answer"]}).with_inputs(
        "question"
    )
    for r in devset
]
evaluate(cot_baseline, devset=devset_with_input)
kwargs = dict(
    num_threads=64, display_progress=True, display_table=False
)  # Used in the Evaluate class during the optimization process
teleprompter = COPRO(
    metric=validate_context_and_answer,
    verbose=True,
)
compiled_prompt_opt = teleprompter.compile(
    cot_baseline, trainset=devset, eval_kwargs=kwargs
)

And here is the error message:

  File "/Users/xxx/.pyenv/versions/3.11.7/envs/venv-apd/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Unable to submit request because candidateCount must be 1 but the entered value was 9. Update the candidateCount value and try again.

The issue is that Gemini only allows a candidate count of 1 (see also the Vertex AI documentation), while COPRO calls Gemini with a candidate count of 10 by default (via the breadth parameter).

Note that setting breadth = 1 does not fix the issue either, because COPRO sometimes calls the LM with n equal to breadth (https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/copro_optimizer.py#L313) and sometimes with breadth - 1 (https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/copro_optimizer.py#L167). This should probably be fixed in the COPRO optimizer itself.
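To make the inconsistency concrete, here is an illustrative sketch (the function is mine, not DSPy's actual code) of the number of completions the two COPRO call sites request for a given breadth. With the default breadth of 10, the second site asks for 9 completions, which matches the `candidateCount ... 9` in the traceback above; with breadth = 1, the second site would request 0.

```python
# Illustrative only -- these are not DSPy internals, just the two n values
# implied by the linked copro_optimizer.py call sites.
def copro_request_sizes(breadth: int) -> tuple[int, int]:
    """Return the n (candidate count) COPRO passes at its two call sites."""
    n_first = breadth        # the copro_optimizer.py#L313-style call
    n_second = breadth - 1   # the copro_optimizer.py#L167-style call
    return n_first, n_second

print(copro_request_sizes(10))  # (10, 9) -- the 9 seen in the error above
print(copro_request_sizes(1))   # (1, 0)  -- breadth=1 still misbehaves
```

So no single breadth value keeps every call at exactly one candidate.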

Independently of that, there needs to be a mechanism that ensures Gemini is never called with a candidate count other than 1. I will submit a PR for this.
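One possible shape for such a mechanism, sketched below under assumptions (the `request` callable and its keyword arguments are hypothetical stand-ins for the real Vertex AI client, not DSPy's actual API): clamp `candidate_count` to 1 before the call and emulate n > 1 candidates by issuing n sequential single-candidate requests.

```python
from typing import Callable

def sample_n(request: Callable[..., str], prompt: str, n: int = 1, **kwargs) -> list[str]:
    """Emulate n candidates for a backend (like Gemini) that only accepts
    candidate_count=1, by making n single-candidate calls.
    `request` is a hypothetical single-completion callable."""
    kwargs["candidate_count"] = 1  # never let a value > 1 reach the API
    return [request(prompt, **kwargs) for _ in range(n)]

# Usage with a stub standing in for the real Vertex AI client:
def fake_request(prompt, **kwargs):
    assert kwargs["candidate_count"] == 1  # the clamp guarantees this
    return f"completion for: {prompt}"

candidates = sample_n(fake_request, "Who won?", n=9)
print(len(candidates))  # 9 candidates, none of which hit the 400 error
```

This trades one batched call for n sequential ones, so it is slower and costs more requests, but it keeps COPRO's breadth/breadth-1 behavior working unchanged against Gemini.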