stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License

long_text attribute for retriever passages #166

Open andreapiso opened 9 months ago

andreapiso commented 9 months ago

Hi, I am creating a custom retriever by inheriting from dspy.Retrieve and overriding the __init__ and forward methods.

Right now I am getting an error saying that "long_text" is missing from the passages my retriever returns:

File ~/miniconda3/lib/python3.10/site-packages/dsp/primitives/search.py:10, in <listcomp>(.0)
      8     raise AssertionError("No RM is loaded.")
      9 passages = dsp.settings.rm(query, k=k, **kwargs)
---> 10 passages = [psg.long_text for psg in passages]
     12 if dsp.settings.reranker:
     13     passages_cs_scores = dsp.settings.reranker(query, passages)

AttributeError: 'str' object has no attribute 'long_text'

I am confused because I assumed that passages would be a list of strings, but that does not seem to be the case. However, when I look at the Pinecone retriever, which I am using as a reference for my own implementation, it does not appear to use the "long_text" field either.

https://github.com/stanfordnlp/dspy/blob/main/dspy/retrieve/pinecone_rm.py

What am I missing?
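For readers following along, the failure in the traceback can be reproduced without DSPy. The sketch below is only an illustration: extract_long_text is a hypothetical helper that copies the list comprehension from line 10 of the traceback, and SimpleNamespace is a stand-in for whatever passage object the framework expects, not a DSPy type.

```python
from types import SimpleNamespace


def extract_long_text(passages):
    # Same expression as line 10 of dsp/primitives/search.py in the traceback.
    return [psg.long_text for psg in passages]


# Plain strings, as a custom retriever might return them.
as_strings = ["first passage", "second passage"]

# The same passages wrapped in objects that expose a long_text attribute.
# SimpleNamespace is a stand-in for illustration only.
as_objects = [SimpleNamespace(long_text=text) for text in as_strings]

try:
    extract_long_text(as_strings)
    error_message = None
except AttributeError as exc:
    # Strings have no long_text attribute, so this branch runs.
    error_message = str(exc)

# Objects with a long_text attribute pass through without error.
texts = extract_long_text(as_objects)
```

This suggests the caller wants objects with a long_text attribute rather than bare strings, which matches the error message in the original report.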

wicusverhoef commented 9 months ago

I am encountering the same.

jamesliu commented 9 months ago

Same here

cyyeh commented 5 months ago

@okhat Could you help answer this question? Thanks a lot!

vaaale commented 5 months ago

Just had the same issue. This fixed it:

wiki_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(lm=ollama_model, rm=wiki_abstracts)

ac-sagarmathpal commented 4 months ago

Facing the same issue

koshyviv commented 4 months ago

I was able to resolve it by adding

if hasattr(passages, 'passages'):
    passages = passages.passages

after this line https://github.com/stanfordnlp/dspy/blob/42a5943379d28d1673dc8fe332a3d596efdfc7a3/dsp/primitives/search.py#L12

It seems that passages = dsp.settings.rm(query, k=k, **kwargs) returns a Prediction object rather than a list of passages.
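The patch above could be extended into a small normalization step that accepts either shape. This is a minimal sketch, not DSPy code: normalize_passages is a hypothetical helper, and SimpleNamespace stands in for both the Prediction-like container and the passage objects.

```python
from types import SimpleNamespace


def normalize_passages(result):
    """Return a list of objects that each expose a long_text attribute."""
    # Unwrap a Prediction-like container, mirroring the hasattr check above.
    if hasattr(result, "passages"):
        result = result.passages

    normalized = []
    for psg in result:
        if isinstance(psg, str):
            # Wrap bare strings so that psg.long_text resolves.
            psg = SimpleNamespace(long_text=psg)
        normalized.append(psg)
    return normalized


# Stand-in for a prediction object that carries its passages as strings.
fake_prediction = SimpleNamespace(passages=["foo", "bar"])
texts = [psg.long_text for psg in normalize_passages(fake_prediction)]

# A plain list of strings is handled by the same helper.
plain_texts = [psg.long_text for psg in normalize_passages(["baz"])]
```

Handling both cases in one place would keep custom retrievers working whether they return a container or a plain list, at the cost of hiding the mismatch rather than fixing the interface.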

fabiannagel commented 4 months ago

I don't know why, but retrieve.py expects a list of dictionaries from the custom retriever. This works for me:

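    # Assumes: from typing import List, Optional, Union and from dsp.utils import dotdict
    def forward(self, query_or_queries: Union[str, List[str]], k: Optional[int] = None) -> List[dotdict]:
        context = ['foo', 'bar']
        # Wrap each string so that psg.long_text resolves in dsp/primitives/search.py
        return [dotdict({"long_text": passage}) for passage in context]

jettro commented 4 months ago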

I think a few parts are important for this to work:

A change is needed so that the retrieve function uses the Prediction object, and all Retrievers should return this object. Otherwise, we must change the return type.

I did not check which other functions use this forward method. From what I have seen so far, we do not want to change the interface, so improving the Retrievers and the retrieve function feels more logical.

fireking77 commented 3 months ago

Is there an abstract class or object for this passage in the DSPy framework? ;) It would be nice, because when we implement a custom retriever we could get the expected signature from the framework. What I want to say is that if the framework (DSPy) works with a Passage API like this:

class Passage(ABC):
    long_text: str

then it should be defined here, and external tools and frameworks can interface with it through an adapter if necessary ;) In my case I would just wire it :)

Anyway, thanks for the info. I think I can fix this issue on my end.
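One way the Passage idea above could look is sketched below. None of this exists in DSPy as far as this thread shows: Passage, TextPassage, and adapt are hypothetical names, and typing.Protocol is used here in place of the ABC from the comment so that third-party objects qualify without inheriting from anything.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol, runtime_checkable


@runtime_checkable
class Passage(Protocol):
    """Hypothetical interface: anything with a long_text attribute."""

    long_text: str


@dataclass
class TextPassage:
    """Hypothetical concrete passage that satisfies the Passage protocol."""

    long_text: str


def adapt(raw_texts: Iterable[str]) -> List[Passage]:
    # Adapter: wrap plain strings from an external retriever.
    return [TextPassage(long_text=text) for text in raw_texts]


passages = adapt(["foo", "bar"])
texts = [psg.long_text for psg in passages]

# runtime_checkable allows a structural isinstance check on the attribute.
wrapped_ok = isinstance(passages[0], Passage)
string_ok = isinstance("foo", Passage)
```

A structural protocol lets external tools plug in through a thin adapter, which is the wiring described in the comment, while an ABC would require them to subclass a framework type.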

ethanniser commented 1 week ago

I just ran into this as well. The examples the docs give for a custom RM lead to this exact issue.