sdsc-ordes / kg-llm-interface

Langchain-powered natural language interface to knowledge-graphs.
Apache License 2.0

allow multiple chroma collections #20

Open cmdoret opened 8 months ago

cmdoret commented 8 months ago

We use a Chroma collection (currently named test, should be named schema) to embed the ontology/schema of the knowledge graph. In many cases, there may be multiple layers of schema, or taxonomies / picklists, which can be very large.

Storing all those layers in the same collection poses a problem: large picklists / schemas will be over-represented in the retrieved results, making it effectively impossible to fetch terms from the smaller layers.

LangChain has an ensemble retriever and a merger retriever that address this issue: they allow us to create multiple collections and fetch a predefined number of items from each collection based on a single query.
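To make the idea concrete, here is a minimal, dependency-free sketch of "fetch a fixed number of items per collection, then merge". It is not LangChain's or Chroma's API: the helper names (`top_k`, `multi_collection_fetch`), the toy 2-d embeddings, and the `ex:` terms are all hypothetical. In the real implementation, per-collection Chroma retrievers wrapped in an ensemble or merger retriever would presumably take this role.

```python
from itertools import zip_longest
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(collection, query, k):
    """Return the k terms of one collection most similar to the query."""
    ranked = sorted(collection, key=lambda item: cosine(item[1], query), reverse=True)
    return [term for term, _ in ranked[:k]]


def multi_collection_fetch(collections, query, k_per_collection):
    """Fetch a fixed number of terms from each collection, then interleave them.

    Hypothetical stand-in for a merger-style retriever: every collection
    contributes its own quota, regardless of how large the others are.
    """
    per_collection = [
        top_k(collections[name], query, k) for name, k in k_per_collection.items()
    ]
    merged = []
    for group in zip_longest(*per_collection):
        merged.extend(term for term in group if term is not None)
    return merged


# Toy data (assumed): a tiny schema layer and a large picklist layer.
collections = {
    "schema": [("ex:Person", [1.0, 0.0]), ("ex:Organization", [0.0, 1.0])],
    "picklist": [(f"ex:term{i}", [1.0, 0.01 * i]) for i in range(100)],
}
query = [0.9, 0.4]

# Single pooled collection: the large picklist crowds out the schema terms.
pooled = top_k(collections["schema"] + collections["picklist"], query, 3)

# One quota per collection: the schema layer is always represented.
results = multi_collection_fetch(collections, query, {"schema": 1, "picklist": 2})
```

With this toy data, `pooled` contains only picklist terms, while `results` always includes one schema term next to two picklist terms. The per-collection quota is what guarantees that the smaller layers stay reachable.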

Objective: support multi-collection Chroma via ensemble or merger retriever.

Requirements: