NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.
Apache License 2.0
[FEA]: Add new SimpleMessageBroker and supporting elements to NV-ingest #226
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Currently preventing usage
Please provide a clear description of problem this feature solves
Overview:
Currently, the ingest pipeline relies on a message broker (Redis by default) to feed data, which requires deploying both a message broker container and a front-end REST service. For testing or proof-of-concept scenarios, it would be beneficial to have a more streamlined option that eliminates these dependencies.
Solution:
Implement a simple inline message broker that can be used within the pipeline and create a corresponding client interface. This will allow the ingest service to run independently, without requiring local dependencies on an external message broker or REST service.
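To make the idea of an inline broker concrete, here is a minimal sketch of an in-process broker built on named FIFO queues. The class name `InlineBroker` and its `push`/`pop` methods are hypothetical, chosen only for illustration; they are not the proposed SimpleMessageBroker API.

```python
import queue
import threading
from collections import defaultdict


class InlineBroker:
    """Hypothetical in-process broker: named FIFO queues, no external service."""

    def __init__(self):
        self._queues = defaultdict(queue.Queue)
        self._lock = threading.Lock()  # guards creation of new named queues

    def _get_queue(self, name):
        with self._lock:
            return self._queues[name]

    def push(self, name, message):
        """Append a message to the named queue."""
        self._get_queue(name).put(message)

    def pop(self, name, timeout=1.0):
        """Return the oldest message, or None if none arrives within timeout."""
        try:
            return self._get_queue(name).get(timeout=timeout)
        except queue.Empty:
            return None
```

A pipeline stage and a client running in the same process could share one `InlineBroker` instance, with one queue name for incoming jobs and another for results, which removes the need for a Redis container in test setups.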
Describe the feature, and optionally a solution or implementation and any alternatives
Introduce socket_task_source and socket_task_sink components to the nv_ingest service. These will be configurable to listen on a specified source, allowing jobs to be accepted and results returned over sockets. Additionally, update nv_ingest_client with options to submit jobs and fetch job results from the nv_ingest service via these socket connections.
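As a rough illustration of accepting jobs and returning results over sockets, the sketch below uses only the Python standard library. Everything in it is an assumption made for the example: the newline-delimited JSON wire format, the `submit`/`fetch` operations, the `process` placeholder, and the function names `start_server`, `submit_job`, and `fetch_result`. It does not reflect the actual socket_task_source, socket_task_sink, or nv_ingest_client interfaces.

```python
import json
import socket
import socketserver
import threading
import uuid

RESULTS = {}  # job_id -> result; stand-in for whatever the sink would store
RESULTS_LOCK = threading.Lock()


def process(payload):
    # Placeholder for the real pipeline work (assumption for this sketch).
    return {"length": len(payload)}


class TaskHandler(socketserver.StreamRequestHandler):
    """Handles one newline-delimited JSON request per connection."""

    def handle(self):
        line = self.rfile.readline()
        if not line:
            return
        request = json.loads(line)
        if request["op"] == "submit":
            job_id = str(uuid.uuid4())
            result = process(request["payload"])
            with RESULTS_LOCK:
                RESULTS[job_id] = result
            reply = {"job_id": job_id}
        elif request["op"] == "fetch":
            with RESULTS_LOCK:
                result = RESULTS.pop(request["job_id"], None)
            reply = {"result": result}
        else:
            reply = {"error": "unknown op"}
        self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port=0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), TaskHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def _call(address, request):
    # Open a connection, send one JSON line, and read one JSON line back.
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall((json.dumps(request) + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


def submit_job(address, payload):
    return _call(address, {"op": "submit", "payload": payload})["job_id"]


def fetch_result(address, job_id):
    return _call(address, {"op": "fetch", "job_id": job_id})["result"]
```

In this sketch the job is processed synchronously at submit time to keep the example short; a real source and sink would presumably hand the job to the pipeline and return the result once it completes.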
Additional context
No response