deepset-ai / haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

DocumentStore deserialization with from_dict creates a new class instance instead of calling from_dict on the DocumentStore #8204

Closed FHardow closed 3 months ago

FHardow commented 3 months ago

Describe the bug Components that serialize a document store through to_dict do not call from_dict on the document store when deserializing it; they create a new instance of the class directly. The result is wrong whenever the document store's from_dict does more than pass the init parameters to the constructor. We found this problem while testing the new IAM workflow in the OpenSearch integration together with a basic component. The OpenSearch components do not have this problem because they call the DocumentStore's from_dict directly. A link can be found below.
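A minimal, self-contained sketch of the failure mode described above. It does not use the real Haystack or OpenSearch code: `ToyAuth` and `ToyDocumentStore` are hypothetical stand-ins, and the sketch assumes that the store's `from_dict` rebuilds a nested auth object, a step that direct construction from `init_parameters` skips.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ToyAuth:
    """Hypothetical stand-in for an auth helper (for example an IAM config)."""

    region: str

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "ToyAuth", "init_parameters": {"region": self.region}}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyAuth":
        return cls(**data["init_parameters"])


class ToyDocumentStore:
    """Hypothetical document store whose from_dict does extra work."""

    def __init__(self, hosts: str, http_auth: Any = None):
        self.hosts = hosts
        self.http_auth = http_auth

    def to_dict(self) -> Dict[str, Any]:
        # Nested objects are serialized to plain dictionaries.
        auth = self.http_auth.to_dict() if isinstance(self.http_auth, ToyAuth) else self.http_auth
        return {
            "type": "ToyDocumentStore",
            "init_parameters": {"hosts": self.hosts, "http_auth": auth},
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyDocumentStore":
        params = dict(data["init_parameters"])
        auth = params.get("http_auth")
        # Custom step: turn the nested dictionary back into a ToyAuth object.
        if isinstance(auth, dict) and auth.get("type") == "ToyAuth":
            params["http_auth"] = ToyAuth.from_dict(auth)
        return cls(**params)


serialized = ToyDocumentStore("https://localhost:9200", ToyAuth("eu-central-1")).to_dict()

# What the affected components effectively do: instantiate the class directly.
naive = ToyDocumentStore(**serialized["init_parameters"])

# What this issue asks for: delegate to the store's own from_dict.
correct = ToyDocumentStore.from_dict(serialized)

print(type(naive.http_auth).__name__)    # dict
print(type(correct.http_auth).__name__)  # ToyAuth
```

No exception is raised in the naive path, which matches the report: the store is created, but with a raw dictionary where an auth object is expected.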

Affected components are:

Please double-check this list; it was only a quick code search on my side.

Error message No error message is thrown; we only found the problem while testing the IAM OpenSearch setup.

Expected behavior Components that serialize a document store should also deserialize it correctly.
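One way the expected behavior could look, as a sketch under stated assumptions rather than the actual Haystack fix: a component-side helper that prefers the store's own `from_dict` and only falls back to direct construction when the class defines none. The helper `deserialize_store`, the `registry` mapping, and the classes `ToyStore` and `PlainStore` are all hypothetical names introduced for illustration.

```python
from typing import Any, Dict


class ToyStore:
    """Hypothetical store with custom deserialization logic."""

    def __init__(self, index: str, client: Any = None):
        self.index = index
        self.client = client

    def to_dict(self) -> Dict[str, Any]:
        # The client is not serializable, so only the index is stored.
        return {"type": "ToyStore", "init_parameters": {"index": self.index}}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyStore":
        params = data["init_parameters"]
        # Custom step: rebuild the client that to_dict left out.
        return cls(index=params["index"], client=f"client-for-{params['index']}")


class PlainStore:
    """Hypothetical store without a from_dict method."""

    def __init__(self, index: str):
        self.index = index


def deserialize_store(data: Dict[str, Any], registry: Dict[str, type]) -> Any:
    """Rebuild a store from its serialized form, delegating to from_dict when available."""
    store_cls = registry[data["type"]]
    from_dict = getattr(store_cls, "from_dict", None)
    if callable(from_dict):
        return from_dict(data)
    # Fallback for classes that define no from_dict of their own.
    return store_cls(**data["init_parameters"])


registry = {"ToyStore": ToyStore, "PlainStore": PlainStore}

restored = deserialize_store(ToyStore("docs").to_dict(), registry)
plain = deserialize_store({"type": "PlainStore", "init_parameters": {"index": "docs"}}, registry)

print(restored.client)  # client-for-docs
print(plain.index)      # docs
```

Delegating to `from_dict` keeps any store-specific reconstruction logic in one place, so components do not need to know how a particular store rebuilds its state.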

Additional context Test setup that shows that a serialized OpenSearch document store is not deserialized correctly: https://github.com/deepset-ai/haystack-core-integrations/pull/972
