Hey @kacperlukawski
After a community member reported on our Discord that they could not use QdrantDocumentStore in YAML pipelines, I did some investigating and found 2 issues, one of which is an easy fix. Here is a colab to reproduce the issue(s), along with some suggestions to resolve them:
You'll notice an error that says: `Nodes cannot use variadic parameters like *args or **kwargs in their __init__ function.`
Hopefully, this will no longer be an issue once we move to Haystack v2. In the meantime, you can resolve this error by changing the `__init__` slightly so that it no longer takes `**kwargs`. It might also be a good idea to include this YAML pipeline (which is a bare-bones pipeline) as a test in your test suite.
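To illustrate the kind of change meant here, below is a minimal sketch rather than the actual QdrantDocumentStore code. Both classes are hypothetical stand-ins, the parameter names are just the ones from the pipeline params in the log, and `has_variadic_params` is an illustrative helper that mimics the sort of check the error message describes, not Haystack's own validation.

```python
import inspect


def has_variadic_params(cls) -> bool:
    """Return True if cls.__init__ declares *args or **kwargs."""
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    params = inspect.signature(cls.__init__).parameters.values()
    return any(p.kind in variadic for p in params)


class StoreWithKwargs:
    """Hypothetical stand-in: forwards unknown options through **kwargs."""

    def __init__(self, host: str = ":memory:", index: str = "Document", **kwargs):
        self.host = host
        self.index = index
        self.extra = kwargs


class StoreWithExplicitParams:
    """Hypothetical stand-in: every option is a named keyword parameter."""

    def __init__(
        self,
        host: str = ":memory:",
        index: str = "Document",
        embedding_dim: int = 512,
        recreate_index: bool = False,
    ):
        self.host = host
        self.index = index
        self.embedding_dim = embedding_dim
        self.recreate_index = recreate_index


if __name__ == "__main__":
    print(has_variadic_params(StoreWithKwargs))
    print(has_variadic_params(StoreWithExplicitParams))
```

A check like this could also live in the test suite next to the bare-bones YAML pipeline test, so a reintroduced `**kwargs` is caught early.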
And now comes the less nice issue. You will notice another log that says: `ValidationError: {'name': 'DocumentStore', 'type': 'QdrantDocumentStore', 'params': {'host': ':memory:', 'index': 'Document', 'embedding_dim': 512, 'recreate_index': True}} is not valid under any of the given schemas`
Haystack YAML pipelines can only work with serializable objects, and some of the QdrantDocumentStore parameters, such as hnsw_config, optimizers_config, wal_config, quantization_config and init_from, are not. One suggestion we had for this is providing a string-to-type dict. This way, the Haystack DocumentStore init only receives the key of the type that you want to use as a string, and the actual type is then passed to the QdrantClient. The other, less ideal option would be to remove these parameters until Haystack v2.
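A rough sketch of the string-to-type dict idea, under loud assumptions: `HnswConfigStandIn`, `HNSW_PRESETS`, `resolve_hnsw_config` and `StoreSketch` are all hypothetical names, and the field values are made up for illustration. The real config types would come from the Qdrant client; a plain dataclass stands in for them here so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HnswConfigStandIn:
    """Hypothetical stand-in for a client-side config object (illustrative fields)."""

    m: int
    ef_construct: int


# The string-to-type dict: YAML only ever sees the keys, never the objects.
HNSW_PRESETS: Dict[str, HnswConfigStandIn] = {
    "default": HnswConfigStandIn(m=16, ef_construct=100),
    "high_recall": HnswConfigStandIn(m=32, ef_construct=256),
}


def resolve_hnsw_config(key: Optional[str]) -> Optional[HnswConfigStandIn]:
    """Map a serializable key to the object that would be handed to the client."""
    if key is None:
        return None
    if key not in HNSW_PRESETS:
        raise ValueError(
            f"Unknown hnsw_config key {key!r}; expected one of {sorted(HNSW_PRESETS)}"
        )
    return HNSW_PRESETS[key]


class StoreSketch:
    """Hypothetical document store whose init accepts only serializable values."""

    def __init__(self, host: str = ":memory:", hnsw_config: Optional[str] = None):
        self.host = host
        # The string key is what a YAML pipeline would serialize.
        self.hnsw_config_key = hnsw_config
        # The resolved object is what would be passed on to the client.
        self.client_hnsw_config = resolve_hnsw_config(hnsw_config)


if __name__ == "__main__":
    store = StoreSketch(hnsw_config="high_recall")
    print(store.hnsw_config_key, store.client_hnsw_config)
```

The trade-off is that users can only pick from the predefined keys, so fully custom configs would still have to wait for another mechanism.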
@TuanaCelik I guess all the parameters should already be serializable, as we also use Pydantic. I'll try to make things work, even with the current interface, except for the **kwargs.