NVIDIA / nv-ingest

NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.
Apache License 2.0

[FEA]: Embedding and VDB upload from JSONL file #206

Open ChrisJar opened 3 weeks ago

ChrisJar commented 3 weeks ago

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request?

Currently preventing usage

Please provide a clear description of the problem this feature solves

I have a JSONL file of text snippets with corresponding IDs that I want to embed and upload to Milvus through the NV-Ingest Python client, then retrieve with LlamaIndex. It's important that the IDs are preserved through every step of this process.

Describe the feature, and optionally a solution or implementation and any alternatives

I would like a feature that lets me submit a JSONL file with an id field and a text field to the NV-Ingest client, which would embed the text and upload it to the VDB while keeping the associated ID.
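To make the request concrete, here is a minimal sketch of the ID-preserving flow, written with the Python standard library only. It does not use the NV-Ingest client, Milvus, or LlamaIndex APIs. The names `read_jsonl_records`, `build_rows`, and `toy_embed` are hypothetical, the field names `id` and `text` are taken from the description above, and `toy_embed` is a placeholder standing in for a real embedding model.

```python
import json
from typing import Callable, Iterable, Iterator, List


def read_jsonl_records(
    lines: Iterable[str], id_field: str = "id", text_field: str = "text"
) -> Iterator[dict]:
    """Yield {"id", "text"} dicts from JSONL lines, skipping blank lines."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if id_field not in obj or text_field not in obj:
            raise ValueError(
                f"line {line_no}: expected fields {id_field!r} and {text_field!r}"
            )
        yield {"id": obj[id_field], "text": obj[text_field]}


def build_rows(
    records: Iterable[dict],
    embed_fn: Callable[[List[str]], List[List[float]]],
) -> List[dict]:
    """Embed each record's text and keep the original id next to its vector."""
    records = list(records)
    vectors = embed_fn([r["text"] for r in records])
    if len(vectors) != len(records):
        raise ValueError("embed_fn must return exactly one vector per input text")
    return [
        {"id": r["id"], "text": r["text"], "vector": v}
        for r, v in zip(records, vectors)
    ]


def toy_embed(texts: List[str]) -> List[List[float]]:
    """Placeholder (assumption): NOT a real embedding model."""
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


# Hypothetical input in the shape described above: one JSON object per line.
sample_lines = [
    '{"id": "doc-1", "text": "first snippet"}',
    "",
    '{"id": "doc-2", "text": "second snippet"}',
]

rows = build_rows(read_jsonl_records(sample_lines), toy_embed)
for row in rows:
    print(row["id"], row["vector"])
```

Each resulting row carries the original ID alongside its text and vector, which is the shape a vector database upload step would need for the IDs to remain available at retrieval time.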

Additional context

No response