We need to build a serving layer (supporting HTTP/1 and HTTP/2) on top of the Numaflow engine that exposes both synchronous and asynchronous endpoints. The serving layer can also optionally run on its own, without a Numaflow pipeline, for low-latency endpoints, provided the processing logic is written using the Numaflow Map SDK.
### Example Use Cases
- Model serving using an inference graph (powered by the Numaflow DAG) for complex models.
- Model execution that proxies traffic to simple models without a Numaflow pipeline, where the models are still interfaced through the Numaflow Map SDK.
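To make the sync/async distinction concrete, here is a minimal in-process sketch in Python. It is not Numaflow code and does not use the Numaflow SDK: `ServingSketch`, `handle_sync`, `handle_async`, `fetch`, and `upper_map` are all hypothetical names chosen for illustration, and a thread pool stands in for whatever actually runs the processing logic.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class ServingSketch:
    """Hypothetical in-process stand-in for a serving layer (not Numaflow code)."""

    def __init__(self, map_fn: Callable[[bytes], bytes]) -> None:
        self._map_fn = map_fn
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._pending: Dict[str, Future] = {}

    def handle_sync(self, payload: bytes) -> bytes:
        # Sync endpoint: the caller blocks until the result is ready.
        return self._map_fn(payload)

    def handle_async(self, payload: bytes) -> str:
        # Async endpoint: hand back a tracking id right away and
        # run the processing logic in the background.
        request_id = uuid.uuid4().hex
        self._pending[request_id] = self._pool.submit(self._map_fn, payload)
        return request_id

    def fetch(self, request_id: str, timeout: float = 5.0) -> bytes:
        # Look up the result of an earlier async request,
        # waiting at most `timeout` seconds for it to finish.
        return self._pending.pop(request_id).result(timeout=timeout)


def upper_map(payload: bytes) -> bytes:
    # Placeholder for user logic that would be written against a Map-style SDK.
    return payload.upper()
```

In this sketch a sync caller would use `handle_sync(b"hi")` and get the processed payload back directly, while an async caller would keep the id returned by `handle_async` and pass it to `fetch` later. How the real serving layer tracks and stores async results is left to the linked tasks.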
Message from the maintainers:
If you wish to see this enhancement implemented, please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
### Tasks
- [ ] https://github.com/numaproj/numaflow/pull/1765
- [ ] https://github.com/numaproj/numaflow/issues/1813
- [ ] https://github.com/numaproj/numaflow/issues/1981
- [x] update servesink to tonic 0.12
- [ ] https://github.com/numaproj/numaflow/issues/1980
- [ ] https://github.com/numaproj/numaflow/issues/1979
- [x] TTL for expiring tracking and store entries
- [ ] https://github.com/numaproj/numaflow/issues/1982
- [ ] https://github.com/numaproj/numaflow/issues/1857
- [ ] use published image in the container creation
- [ ] https://github.com/numaproj/numaflow/issues/1843
- [ ] UI to track messages for Serving
- [ ] https://github.com/numaproj/numaflow/issues/1842
- [ ] https://github.com/numaproj/numaflow/issues/1876
- [ ] integrate with Redis ISB