Open susht3 opened 2 months ago
To retrieve the nodes of a specific document, or all nodes across multiple files, by specifying a file name, you can write a custom retriever that extends the BaseRetriever class from the LlamaIndex library. A custom retriever can implement both synchronous and asynchronous retrieval, so it fits either a blocking script or an async application.
Here's a basic example to guide you:
from typing import List

from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle


class MyRetriever(BaseRetriever):
    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        # Your synchronous retrieval logic here
        return []  # placeholder: return the matching nodes

    async def _aretrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        # Your asynchronous retrieval logic here
        return []  # placeholder: return the matching nodes


# Initialize your retriever
my_retriever = MyRetriever()

# For synchronous retrieval
nodes = my_retriever.retrieve("Your query here")

# For asynchronous retrieval (must run inside an async function)
nodes = await my_retriever.aretrieve("Your query here")
This setup puts the retrieval logic in your hands, so you can restrict results to the nodes of a given file or apply other criteria. Whether you use the synchronous or the asynchronous method depends on your application's needs, such as performance considerations or a requirement for real-time data processing.
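To make the file-name idea concrete, here is a minimal sketch that runs without LlamaIndex installed. Everything in it is an assumption for illustration: `Node` and `FileNameRetriever` are hypothetical stand-ins, not LlamaIndex classes, and the "relevance" check is a plain substring match. In a real custom retriever the same filtering step would live inside `_retrieve` and `_aretrieve`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a LlamaIndex node: text plus metadata."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class FileNameRetriever:
    """Toy retriever (not a LlamaIndex class) that narrows results to one file."""

    def __init__(self, nodes: List[Node]) -> None:
        self._nodes = nodes

    def retrieve(self, query: str, file_name: Optional[str] = None) -> List[Node]:
        # Keep nodes from the requested file, or every node when file_name is None.
        candidates = [
            n for n in self._nodes
            if file_name is None or n.metadata.get("file_name") == file_name
        ]
        # Naive relevance test: case-insensitive substring match on the query.
        return [n for n in candidates if query.lower() in n.text.lower()]

    async def aretrieve(self, query: str, file_name: Optional[str] = None) -> List[Node]:
        # Run the synchronous path in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.retrieve, query, file_name)


nodes = [
    Node("Revenue grew in Q1.", {"file_name": "report_a.txt"}),
    Node("Revenue fell in Q2.", {"file_name": "report_b.txt"}),
]
retriever = FileNameRetriever(nodes)

# Synchronous call limited to one file.
only_a = retriever.retrieve("revenue", file_name="report_a.txt")

# Asynchronous call over all files.
all_hits = asyncio.run(retriever.aretrieve("revenue"))
```

Passing `file_name=None` falls back to searching every node, which covers the "all nodes" case with the same call. Note that `asyncio.to_thread` needs Python 3.9 or newer.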
For more detailed information and examples, you might want to explore the LlamaIndex repository and its documentation.
Probably use metadata filtering, assuming the file name/path is in the metadata of the nodes you inserted. Here's an example with Pinecone: https://docs.llamaindex.ai/en/stable/examples/vector_stores/pinecone_metadata_filter/?h=metadata+filter
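For anyone unsure what metadata filtering does under the hood, here is a rough, library-free sketch of exact-match filtering on node metadata. The `matches` helper and the dict-based nodes are hypothetical and only illustrate the idea; they are not how any particular vector store implements it.

```python
from typing import Any, Dict, List, Optional


def matches(metadata: Dict[str, Any], filters: Optional[Dict[str, Any]]) -> bool:
    """Return True when every filter key equals the node's metadata value (AND)."""
    if not filters:
        return True  # no filters: every node passes
    return all(metadata.get(key) == value for key, value in filters.items())


# Hypothetical nodes as plain dicts; real nodes would come from your index.
nodes: List[Dict[str, Any]] = [
    {"text": "intro", "metadata": {"file_name": "a.pdf", "page": 1}},
    {"text": "method", "metadata": {"file_name": "a.pdf", "page": 2}},
    {"text": "summary", "metadata": {"file_name": "b.pdf", "page": 1}},
]

# Nodes from a single file.
from_a = [n["text"] for n in nodes if matches(n["metadata"], {"file_name": "a.pdf"})]

# All nodes: pass no filters.
everything = [n["text"] for n in nodes if matches(n["metadata"], None)]

# Filters combine with AND: one file, one page.
a_page_2 = [
    n["text"] for n in nodes
    if matches(n["metadata"], {"file_name": "a.pdf", "page": 2})
]
```

In LlamaIndex itself, the equivalent is, as far as I know, building a `MetadataFilters` object from `ExactMatchFilter` entries (both in `llama_index.core.vector_stores`) and passing it as `filters=` when creating the retriever, as the linked example does.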
Thank you, it looks simple and efficient, but it uses PINECONE_API_KEY. How do I get this key? Is it free?
Question
I have multiple files. Can I retrieve the nodes of a specific document, or all nodes? Something like this: nodes = recursive_retriever.retrieve(query, "xx_name"), where xx_name is my file name.