[DOC]: Add "blueprint" diagram and explain

NVIDIA / nv-ingest

NVIDIA Ingest is a set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.

Apache License 2.0


@randerzander Further, we should add descriptions like the following to make it easier to understand how the architecture comes together:

PDF Ingestion NIM microservices

nv-yolox-structured-image: A fine-tuned object detection model to detect charts, plots, and tables in PDFs.
Deplot: A popular community pix2struct model for generating descriptions of charts.
CACHED: An object detection model used to identify various elements in graphs.
PaddleOCR: An optical character recognition (OCR) model to transcribe text from tables and charts.

NVIDIA NeMo Retriever NIM microservices

nv-embedqa-e5-v5: A popular community base-embedding model optimized for text question-answering retrieval.
nv-rerankqa-mistral4b-v3: A popular community base model fine-tuned for text reranking for high-accuracy question answering.

For more information, see "An Easy Introduction to Multimodal Retrieval-Augmented Generation".
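
To show how these descriptions could sit alongside the diagram, here is a rough sketch that groups the services above by role. It is only an illustration: the `Service` class, the `SERVICES` list, and the `by_group` helper are hypothetical names made up for this example and are not part of the nv-ingest API. The model names and roles are copied from the list above.

```python
from dataclasses import dataclass


# Hypothetical record type for this sketch; not an nv-ingest class.
@dataclass(frozen=True)
class Service:
    name: str
    group: str
    role: str


# Names and roles taken from the descriptions above.
SERVICES = [
    Service("nv-yolox-structured-image", "PDF ingestion", "detect charts, plots, and tables in PDFs"),
    Service("Deplot", "PDF ingestion", "generate descriptions of charts"),
    Service("CACHED", "PDF ingestion", "identify elements in graphs"),
    Service("PaddleOCR", "PDF ingestion", "transcribe text from tables and charts"),
    Service("nv-embedqa-e5-v5", "NeMo Retriever", "embed text for question-answering retrieval"),
    Service("nv-rerankqa-mistral4b-v3", "NeMo Retriever", "rerank text for question answering"),
]


def by_group(services):
    """Return a dict mapping each group name to its service names, in list order."""
    groups = {}
    for service in services:
        groups.setdefault(service.group, []).append(service.name)
    return groups


if __name__ == "__main__":
    for group, names in by_group(SERVICES).items():
        print(f"{group}: {', '.join(names)}")
```

A table like this could back a legend for the blueprint diagram, so that each box in the picture maps to one line of description.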

[DOC]: Add "blueprint" diagram and explain #25

(Optional) Propose a correction or improvement

PDF Ingestion NIM microservices