qdrant / qdrant-haystack

An integration of Qdrant ANN vector database backend with Haystack
Apache License 2.0

ValidationError: document_store.update_embeddings #33

Open willmsMwx opened 1 year ago

willmsMwx commented 1 year ago

Hello,

I wanted to implement the following example with Qdrant: https://haystack.deepset.ai/tutorials/15_tableqa

Initialisation of the DocumentStore:

    document_store = QdrantDocumentStore(
        ":memory:",
        index="document",
        embedding_dim=768
    )

read_tables() creates a list of Document(content=current_df, content_type="table", id=key) objects, which is then written to the DocumentStore:

    tables = read_tables(f"{doc_dir}/tables.json")
    document_store.write_documents(tables, index=document_index)

An error is thrown when updating the embeddings in the DocumentStore:

    retriever = TableTextRetriever(
        document_store=document_store
    )
    document_store.update_embeddings(retriever=retriever)


    File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/qdrant_haystack/document_stores/qdrant.py:372, in QdrantDocumentStore.update_embeddings(self, retriever, index, update_existing_embeddings, filters, batch_size, headers)
        362 doc_generator = self.get_all_documents_generator(
        363     index=index,
        364     filters=filters,
        365     batch_size=batch_size,
        366     headers=headers,
        367 )
        369 with tqdm(
        370     total=document_count, position=0, unit=" Docs", desc="Updating embeddings"
        371 ) as progress_bar:
    --> 372     for document_batch in get_batches_from_generator(doc_generator, batch_size):
    ...
    ValidationError: 2 validation errors for Document
    content
      str type expected (type=type_error.str)
    content
      instance of DataFrame expected (type=type_error.arbitrary_type; expected_arbitrary_type=DataFrame)