When ingesting a file or directory (recursively or not), we now check whether a .knowledge.json file is present in the directory. The file is JSON with an explicit metadata entry, and the k/v pairs defined there are added as metadata to the documents in the vector store.
.knowledge.json files in nested directories will be merged (with override) with parent metadata files.
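A minimal sketch of the lookup and merge described above, not the code from this PR. It assumes the file has a top-level "metadata" object, that the merge is shallow (a nested key replaces the parent's key wholesale), and that merging starts at the ingestion root. The names `load_directory_metadata` and `merged_metadata` are hypothetical.

```python
import json
from pathlib import Path

METADATA_FILENAME = ".knowledge.json"


def load_directory_metadata(directory: Path) -> dict:
    """Return the "metadata" entry of directory/.knowledge.json, or {} if absent."""
    path = directory / METADATA_FILENAME
    if not path.is_file():
        return {}
    with path.open(encoding="utf-8") as fh:
        # Assumption: the k/v pairs live under a top-level "metadata" key.
        return json.load(fh).get("metadata", {})


def merged_metadata(root: Path, directory: Path) -> dict:
    """Merge metadata from root down to directory; deeper files override."""
    root, directory = root.resolve(), directory.resolve()
    # Raises ValueError if directory is not inside root.
    relative = directory.relative_to(root)
    merged = load_directory_metadata(root)
    current = root
    for part in relative.parts:
        current = current / part
        # Assumption: shallow merge, so a nested key replaces the parent's value.
        merged.update(load_directory_metadata(current))
    return merged
```

With a root file defining `team` and `visibility` and a nested file defining only `visibility`, documents in the nested directory would get the root's `team` and the nested `visibility`.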
Notes
I went with .knowledge.json instead of .metadata.json because I felt like the latter could be too "common" and we'd run into conflicts. By default, we're including hidden files in the ingestion process, so .knowledge.json is not explicitly being ignored.
It's JSON with an explicit metadata entry so we can add fields for new features in the future, e.g. directory content descriptions, which could be merged with dataset metadata for routing retrieval.
Ref #118