stanfordnlp / dspy

DSPy: The framework for programming—not prompting—language models
https://dspy.ai
MIT License

BootstrapFewShot compile fails with 'numpy.float64' object has no attribute 'split' Error #311

Closed: geemi725 closed this issue 8 months ago

geemi725 commented 9 months ago

I am working on a pipeline that uses my own retriever class instead of the built-in dspy retrievers. The goal is to build a RAG model that extracts data from PDFs. However, when I try to compile the program with the teleprompter, it fails. I am following the intro.ipynb notebook. Could you please help?

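import dspy
from dspy.teleprompt import BootstrapFewShot

# TextRetriever and GenerateAnswer are user-defined and not shown in this issue.
class RAG(dspy.Module):
    def __init__(self, k=3, create_db=True, chunk_size=200,
                 vectdb_path=None):
        super().__init__()  # initialize dspy.Module state, as in intro.ipynb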
        self.k = k
        self.create_db = create_db
        self.vectdb_path = vectdb_path
        self.chunk_size = chunk_size

        self.retrieve = TextRetriever()

        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question, filename):
        context = self.retrieve.forward(query=question,filename=filename)

        prediction = self.generate_answer(context=context, question=question,
                                          filename=filename)
        print(prediction.answer)

        return dspy.Prediction(context=context, answer=prediction.answer)

def validate_context_and_answer(example, pred, trace=None):
    # The teleprompter passes a third `trace` argument to the metric.
    answer_EM = dspy.evaluate.answer_exact_match(example, pred)
    return answer_EM

teleprompter = BootstrapFewShot(metric=validate_context_and_answer, max_rounds=2)

compiled_rag = teleprompter.compile(RAG(), trainset=trainset)

ERROR: [four screenshots of the traceback, taken 2024-01-31, not preserved]

arnavsinghvi11 commented 9 months ago

Hi @geemi725 ,

Thanks for the question! This error is likely because your specified filename is of type 'numpy.float64' when it should in fact be of type str. Could you confirm your input field values and types?
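
One way to confirm the field types is a quick scan over the training examples before compiling. The snippet below is only a minimal sketch: the helper `find_non_string_fields` and the sample rows are hypothetical, plain dicts stand in for the actual `dspy.Example` objects, and the field names `question`, `filename`, and `answer` are assumed from the code in the question. The NaN value mimics an empty cell loaded from a table, which is one plausible (unconfirmed) source of a float where a string was expected.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def find_non_string_fields(
    examples: Iterable[Mapping[str, Any]],
    fields: Tuple[str, ...] = ("question", "filename", "answer"),
) -> List[Tuple[int, str, str]]:
    """Return (index, field, type name) for every field value that is not a str."""
    problems = []
    for i, ex in enumerate(examples):
        for field in fields:
            value = ex.get(field)
            if not isinstance(value, str):
                problems.append((i, field, type(value).__name__))
    return problems


# Hypothetical rows standing in for the real trainset.
rows = [
    {"question": "What is the yield?", "filename": "paper1.pdf", "answer": "42%"},
    # A missing value often arrives as a float NaN rather than a string.
    {"question": "Which solvent was used?", "filename": float("nan"), "answer": "THF"},
]

print(find_non_string_fields(rows))
```

If the scan reports anything, dropping those rows or converting the values to `str` before building the trainset should be tried first; whether that is the right fix depends on what the missing values mean in your data.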