Coding-Crashkurse / ParentChild-Retriever


Error when using with index #1

Open adrianruchti opened 4 months ago

adrianruchti commented 4 months ago

Hello, I am trying to insert documents in the backend using the LangChain `index` function. Using your docstore, I get the following error:

    2024-05-20 19:13:24 2024-05-20 17:13:24,579 - INFO - Splitting documents...
    2024-05-20 19:13:24 2024-05-20 17:13:24,579 - INFO - Inserting documents into PGVector Store.
    2024-05-20 19:13:28 2024-05-20 17:13:28,630 - INFO - Store created successfully
    2024-05-20 19:13:28 2024-05-20 17:13:28,630 - ERROR - An error occurred: Vectorstore vectorstore=<langchain_postgres.vectorstores.PGVector object at 0x7f65c6f96350> docstore=<src.db.postgres_parentchild.PostgresStore object at 0x7f65d4f9dc90> child_splitter=<langchain_text_splitters.character.RecursiveCharacterTextSplitter object at 0x7f65d4eea090> parent_splitter=<langchain_text_splitters.character.RecursiveCharacterTextSplitter object at 0x7f65c5116d50> does not have required method delete
    2024-05-20 19:13:28 INFO: 172.18.0.8:56032 - "POST /rag/process_dropbox_local/postgres/testmatt HTTP/1.1" 500 Internal Server Error

    connection_string = f"{config('PGVECTOR_BASE_CONNECTION_STRING')}{selected_db}"

    # Vector store that holds the embedded child chunks.
    vectorstore = PGVector(
        embeddings=embedding_config,
        collection_name=selected_collection,
        connection=connection_string,
        use_jsonb=True,
    )

    # Retriever that keeps the parent documents in the Postgres docstore.
    parent_child_retriever = ParentDocumentRetriever(
        vectorstore=vectorstore,
        docstore=PostgresStore(connection_string=connection_string),
        child_splitter=child_splitter,
        parent_splitter=parent_splitter,
    )
    logging.info("Store created successfully")

    # The error is raised here: the retriever, not the vector store,
    # is passed to index() as its third argument.
    index(
        documents,
        record_manager,
        parent_child_retriever,
        cleanup="full",
        source_id_key="source",
    )

    logging.info("Documents loaded and embeddings stored successfully.")

Maybe your idea was to use it only for retrieval and not for insertion. I am looking for a way to split the insertion and retrieval processes. Thanks for your great work.

Coding-Crashkurse commented 4 months ago

Hi Adrian, the ParentDocumentRetriever allows you to add documents, but unfortunately it is not compatible with the indexing API.
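
For anyone landing here with the same error, the message in the log ("does not have required method delete") suggests a duck-typing check on the object passed to `index`. The following is only a rough, standard-library sketch of that kind of check, not LangChain's actual code: `REQUIRED_METHODS`, `missing_methods`, `FakeVectorStore` and `FakeParentChildRetriever` are all hypothetical names made up for illustration, and the assumption that `add_documents` is required alongside `delete` is inferred rather than taken from this thread.

```python
# Assumption: the indexing call needs both of these methods on the object it
# receives. Only "delete" is confirmed by the error message in the log.
REQUIRED_METHODS = ("delete", "add_documents")


def missing_methods(store, required=REQUIRED_METHODS):
    """Return the required method names that `store` does not implement."""
    return [name for name in required if not callable(getattr(store, name, None))]


class FakeVectorStore:
    """Hypothetical stand-in for a vector store: supports add and delete."""

    def __init__(self):
        self.docs = {}

    def add_documents(self, documents, ids):
        for doc_id, doc in zip(ids, documents):
            self.docs[doc_id] = doc
        return list(ids)

    def delete(self, ids):
        for doc_id in ids:
            self.docs.pop(doc_id, None)


class FakeParentChildRetriever:
    """Hypothetical stand-in for a retriever: can add documents, cannot delete."""

    def __init__(self, vectorstore):
        self.vectorstore = vectorstore

    def add_documents(self, documents, ids):
        return self.vectorstore.add_documents(documents, ids)


store = FakeVectorStore()
retriever = FakeParentChildRetriever(store)

print(missing_methods(store))      # []
print(missing_methods(retriever))  # ['delete']

# Adding through the retriever itself still works in this sketch.
retriever.add_documents(["parent text"], ids=["doc-1"])
```

In this sketch the retriever fails the check only because it has no `delete`, which matches the behaviour described above: documents can be added through the retriever's own `add_documents`, while a cleanup mode such as `cleanup="full"` needs an object that can also delete.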

adrianruchti commented 4 months ago

Thank you. Nice solution anyway. I hope to see more production FastAPI RAG implementations on your channel in the future.