Open theoky opened 11 months ago
I've just found this comment in the relevant source file:
:param update_existing_embeddings: Not used by QdrantDocumentStore, as all the points
must have a corresponding vector in Qdrant.
So for my use case:
using update_embeddings does not work.
So a working use case would be
So update_embeddings is basically useful only when I change the model generating the embeddings? To me, this seems to go against the intent of having a simple pipeline.
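For illustration, here is a minimal sketch of the general idea of embedding only documents that are not stored yet, before they are written. It is not Haystack or Qdrant API: the dict-based documents, the `existing_ids` set, and the `embed` callable are all hypothetical stand-ins, and a real pipeline would substitute its own retriever and document store calls.

```python
from typing import Callable, Dict, Iterable, List, Set

Doc = Dict[str, object]  # hypothetical document shape: {"id": ..., "content": ...}


def embed_only_new(
    docs: Iterable[Doc],
    existing_ids: Set[str],
    embed: Callable[[List[str]], List[List[float]]],
) -> List[Doc]:
    """Return copies of the docs whose id is not in existing_ids,
    each with an added "embedding" key."""
    # Skip anything that is already stored, so its vector is not recomputed.
    new_docs = [d for d in docs if d["id"] not in existing_ids]
    if not new_docs:
        return []
    # One batched call to the (hypothetical) embedding function.
    vectors = embed([str(d["content"]) for d in new_docs])
    return [{**d, "embedding": v} for d, v in zip(new_docs, vectors)]


def fake_embed(texts: List[str]) -> List[List[float]]:
    """Stand-in embedder for the sketch: one number per text."""
    return [[float(len(t))] for t in texts]


docs = [{"id": "a", "content": "old text"}, {"id": "b", "content": "new text"}]
to_write = embed_only_new(docs, existing_ids={"a"}, embed=fake_embed)
```

Only the document with id "b" ends up in `to_write`, so the cost of embedding scales with the number of new documents rather than the size of the store.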
I'm using qdrant-haystack 1.0.11 with farm-haystack==1.21.2 and Python 3.10.13 on Win10, with Qdrant running in Docker.
When updating the embeddings of a document store, document_store.update_embeddings seems to update all embeddings even when update_existing_embeddings is set to False.
I'm running this code:
After the execution the Qdrant database contains 50 vectors, as expected.
I would also expect update_embeddings(False) to run significantly faster than update_embeddings(True), but both statements run for nearly the same time:
Execution with update: 22.15771689999383, with no update: 20.913242900016485
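A timing comparison like the one above can be made with a small helper around `time.perf_counter`. This is only a sketch: the two lambdas are placeholder workloads standing in for the two update calls, not the code that was actually run.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed time in seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Placeholder workloads; replace with the two calls being compared.
_, with_update = timed(lambda: sum(range(100_000)))
_, no_update = timed(lambda: sum(range(100_000)))
print(f"Execution with update: {with_update}, with no update: {no_update}")
```

A single run of each call is noisy, so repeating the measurement a few times gives a fairer comparison.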
To me this looks like update_embeddings(..., update_existing_embeddings=False) is updating the embeddings, too. What am I missing?