run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: Incorrect Return Type in EarningsCallTranscript.load_data() #16121

Open alaia0a opened 21 hours ago

alaia0a commented 21 hours ago

Bug Description

The load_data() method of the EarningsCallTranscript class returns a single Document object rather than the List[Document] declared in its type hint.

Version

0.11.6

Steps to Reproduce

  1. Call the load_data() method.
  2. Try to pass the result to VectorStoreIndex.from_documents():
    documents = loader.load_data()
    index = VectorStoreIndex.from_documents(documents)
  3. The following error occurs: AttributeError: 'Document' object has no attribute 'get_doc_id'

Relevant Logs/Tracebacks

No response

dosubot[bot] commented 21 hours ago

Hey there, @alaia0a! I'm here to assist you with any bugs, questions, or contributions. Let's tackle this issue together!

The issue arises because the load_data() function in the EarningsCallTranscript class returns a single Document object instead of a List[Document] as per its type hint. This discrepancy causes an AttributeError when used with VectorStoreIndex.from_documents(), which expects a list of Document objects.
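A mismatch like this can go unnoticed because Python does not enforce return annotations at runtime. The sketch below illustrates that with stand-ins: `StubDocument` and this free-standing `load_data` are hypothetical and are not the LlamaIndex classes.

```python
from typing import List


class StubDocument:
    """Hypothetical stand-in for a document class (not llama_index's Document)."""


def load_data() -> List[StubDocument]:
    # The annotation promises a list, but a single object is returned.
    # Python raises no error here; only a static type checker would flag it.
    return StubDocument()  # type: ignore[return-value]


result = load_data()
is_list = isinstance(result, list)  # False: the caller received a bare object
```

The failure only surfaces later, when downstream code treats the result as a list.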

To fix this, modify the load_data() function to return a list containing the Document object:

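def load_data(self) -> List[Document]:
    # Fetch the transcript text and the list of speakers for the given call.
    resp_dict, speakers_list = get_earnings_transcript(
        self.quarter, self.ticker, self.year
    )
    # Wrap the Document in a list so the return value matches the type hint.
    return [Document(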
        text=resp_dict["content"],
        extra_info={
            "ticker": resp_dict["symbol"],
            "quarter": "Q" + str(resp_dict["quarter"]),
            "date_time": resp_dict["date"],
            "speakers_list": speakers_list,
        },
    )]

This change ensures that the load_data() method returns a list of Document objects, which is compatible with VectorStoreIndex.from_documents() [1].
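Until a fix lands in the reader itself, a caller-side workaround may be to normalize whatever load_data() returns before indexing. This is a minimal sketch: `as_document_list` is a hypothetical helper, not part of LlamaIndex, and it assumes the loader returns either a single object or a list/tuple of objects.

```python
from typing import Any, List


def as_document_list(result: Any) -> List[Any]:
    """Return a list of documents whether the loader gave one object or many.

    Hypothetical helper for illustration; not a LlamaIndex API.
    """
    if isinstance(result, (list, tuple)):
        # Already a sequence of documents: copy it into a plain list.
        return list(result)
    # A single object was returned: wrap it so callers can iterate over it.
    return [result]
```

With this in place, the call from the reproduction steps would become `documents = as_document_list(loader.load_data())`, and the same code should keep working once load_data() returns a list.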
