Open Prem-Nitin opened 1 month ago
To ensure that updates to the metadata and text of some chunks are reflected both in the nodes in the Pinecone index and in the parent nodes in the MongoDB docstore, you need two steps: update the nodes in the Pinecone index using the upsert method, then write the same changes to the MongoDB docstore.
Here is an example of how you can achieve this:
# Import paths assume llama-index >= 0.10 with the Pinecone integration installed.
from llama_index.core.schema import MetadataMode
from llama_index.core.vector_stores.utils import node_to_metadata_dict
from llama_index.vector_stores.pinecone.base import (
    ID_KEY,
    METADATA_KEY,
    SPARSE_VECTOR_KEY,
    VECTOR_KEY,
    generate_sparse_vectors,
)


def update_nodes_in_pinecone_and_mongodb(pinecone_vector_store, mongodb_docstore, nodes):
    # Update nodes in Pinecone index. Upserting under an existing ID overwrites that record.
    ids = []
    entries = []
    for node in nodes:
        node_id = node.node_id
        metadata = node_to_metadata_dict(
            node,
            remove_text=pinecone_vector_store.remove_text_from_metadata,
            flat_metadata=pinecone_vector_store.flat_metadata,
        )
        entry = {
            ID_KEY: node_id,
            # get_embedding() raises if the embedding is unset, so re-embed edited text first.
            VECTOR_KEY: node.get_embedding(),
            METADATA_KEY: metadata,
        }
        if pinecone_vector_store.add_sparse_vector and pinecone_vector_store._tokenizer is not None:
            sparse_vector = generate_sparse_vectors(
                [node.get_content(metadata_mode=MetadataMode.EMBED)],
                pinecone_vector_store._tokenizer,
            )[0]
            entry[SPARSE_VECTOR_KEY] = sparse_vector
        ids.append(node_id)
        entries.append(entry)
    pinecone_vector_store._pinecone_index.upsert(
        entries,
        namespace=pinecone_vector_store.namespace,
        batch_size=pinecone_vector_store.batch_size,
        **pinecone_vector_store.insert_kwargs,
    )
    # Update parent nodes in MongoDB docstore.
    # The docstore has no update_document method; add_documents with
    # allow_update=True overwrites stored nodes that share a node ID.
    mongodb_docstore.add_documents(nodes, allow_update=True)
    return ids
In this example, pinecone_vector_store is an instance of PineconeVectorStore, mongodb_docstore is an instance of MongoDocumentStore where the parent nodes are stored, and nodes is the list of nodes to be updated. The function update_nodes_in_pinecone_and_mongodb updates the nodes in both the Pinecone index and the MongoDB docstore [1][2][3][4].
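The update pattern can be tried without live Pinecone or MongoDB services. Below is a minimal sketch under stated assumptions: Node, InMemoryVectorIndex, InMemoryDocstore, toy_embedding, and update_both_stores are all hypothetical stand-ins written for illustration, not LlamaIndex or Pinecone classes. It assumes that leaf chunks live in the vector index, that the docstore holds both leaves and parents, and that an edited parent has to be written back alongside its edited leaf.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a chunk node (not the LlamaIndex class)."""
    node_id: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryVectorIndex:
    """Hypothetical stand-in for a vector index with keyed upsert semantics."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}

    def upsert(self, entries: List[dict]) -> None:
        for entry in entries:
            # Writing under an existing ID replaces the old record.
            self.records[entry["id"]] = entry


class InMemoryDocstore:
    """Hypothetical stand-in for a document store keyed by node ID."""

    def __init__(self) -> None:
        self.docs: Dict[str, Node] = {}

    def add_documents(self, nodes: List[Node], allow_update: bool = True) -> None:
        for node in nodes:
            if not allow_update and node.node_id in self.docs:
                raise ValueError(f"node {node.node_id} already exists")
            # Store a copy so later in-place edits do not leak into the store.
            self.docs[node.node_id] = copy.deepcopy(node)


def toy_embedding(text: str) -> List[float]:
    """Toy embedding; a real pipeline would call an embedding model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def update_both_stores(index, docstore, leaf_nodes, parent_nodes) -> List[str]:
    # Re-embed each edited leaf and upsert it under its existing ID.
    entries = [
        {
            "id": node.node_id,
            "values": toy_embedding(node.text),
            "metadata": {**node.metadata, "text": node.text},
        }
        for node in leaf_nodes
    ]
    index.upsert(entries)
    # Write leaves and edited parents back to the docstore in one call.
    docstore.add_documents([*leaf_nodes, *parent_nodes], allow_update=True)
    return [node.node_id for node in leaf_nodes]


# Initial load of one parent and one leaf chunk.
index, docstore = InMemoryVectorIndex(), InMemoryDocstore()
parent = Node("p1", "old parent text")
leaf = Node("c1", "old chunk", {"topic": "draft"})
update_both_stores(index, docstore, [leaf], [parent])

# Edit text and metadata, then push the same edits to both stores.
leaf.text, leaf.metadata["topic"] = "new chunk", "final"
parent.text = "new parent text"
updated_ids = update_both_stores(index, docstore, [leaf], [parent])
```

Because node IDs are reused, the second call overwrites the earlier records instead of creating duplicates, which is the property the real upsert-based update relies on.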
Question
I used the hierarchical parser for chunking, with Pinecone as my vector store and MongoDB as my docstore. Now I want to update the metadata and text for some chunks. How can I do that so the changes are reflected in both the nodes in the Pinecone index and the parent nodes in the MongoDB docstore?