microsoft / rag-experiment-accelerator

The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluations using Azure Cognitive Search and the RAG pattern.
https://github.com/microsoft/rag-experiment-accelerator
Other
196 stars, 71 forks

Add source to pdf doc metadata #735

Open beandrad opened 2 months ago

beandrad commented 2 months ago

The value of doc["metadata"]["source"] is used to set the chunk filenames. For PDF documents generated using "unstructured", this property is not set, so we should set the source property in their metadata.
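To illustrate the proposal, here is a minimal sketch that assumes each loaded document is a plain dict with a "metadata" dict. The helpers `add_source_metadata` and `chunk_filename`, the "content" key, and the filename pattern are hypothetical names chosen for this example; they are not taken from the repository or from the "unstructured" API.

```python
from pathlib import Path


def add_source_metadata(docs, file_path):
    """Return copies of docs with metadata["source"] filled in.

    Hypothetical helper: assumes each doc is a dict that may or may not
    already carry a "metadata" dict. An existing "source" is left untouched.
    """
    source = str(file_path)
    updated = []
    for doc in docs:
        metadata = dict(doc.get("metadata") or {})
        metadata.setdefault("source", source)
        updated.append({**doc, "metadata": metadata})
    return updated


def chunk_filename(doc, index):
    """Hypothetical naming scheme: derive a chunk filename from the source."""
    stem = Path(doc["metadata"]["source"]).stem
    return f"{stem}_chunk_{index}"


# Documents as they might come back from a PDF loader without a source set.
docs = [
    {"content": "first page text", "metadata": {}},
    {"content": "second page text"},
]
docs = add_source_metadata(docs, "data/report.pdf")
names = [chunk_filename(doc, i) for i, doc in enumerate(docs)]
```

Using `setdefault` keeps any source a loader has already provided, so the same step could be applied to every document type without overwriting existing values.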