tenzir / public-roadmap

The public roadmap of Tenzir
https://docs.tenzir.com/roadmap

Resumable Pipelines #68

Closed dominiklohmann closed 10 months ago

dominiklohmann commented 12 months ago

We want to allow pipelines to be paused and resumed through the frontend.

[Image: Pipeline State Machine]

💯 Definition of Done

- [x] Design pausing and resuming for the frontend
- [x] Implement a mechanism for pausing and resuming pipelines
- [x] Expose pausing and resuming in the pipeline manager API
- [x] Implement the changes in the frontend

dominiklohmann commented 11 months ago

One design question: When we pause a pipeline, should that cut operators off from left to right in the pipeline and let them consume their input entirely to minimize buffer usage, or should it pause the pipeline instantly and keep the buffers around?

It's a trade-off between reducing memory usage for paused pipelines and making pausing pipelines work instantly.
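
To make the two options concrete, here is a minimal toy sketch in Python. It is not Tenzir's implementation: the `Operator` class, its `buffer`, and both pause functions are hypothetical names made up for illustration, and each "step" stands in for whatever work an operator does on one buffered item.

```python
from collections import deque


class Operator:
    """A toy operator with an input buffer (hypothetical, not Tenzir's API)."""

    def __init__(self, name, buffered=0):
        self.name = name
        self.buffer = deque(range(buffered))

    def step(self, downstream):
        """Process one buffered item and hand it to the next operator, if any."""
        if not self.buffer:
            return False
        item = self.buffer.popleft()
        if downstream is not None:
            downstream.buffer.append(item)
        return True


def buffered_items(operators):
    """Count the items that a paused pipeline would keep in memory."""
    return sum(len(op.buffer) for op in operators)


def pause_instantly(operators):
    """Option B: stop everything at once; no work, but buffers are kept."""
    return 0  # zero steps needed before the pause takes effect


def pause_draining(operators):
    """Option A: cut off input left to right and let each operator drain.

    Returns the number of steps taken before the pause takes effect.
    """
    steps = 0
    for i, op in enumerate(operators):
        downstream = operators[i + 1] if i + 1 < len(operators) else None
        while op.step(downstream):
            steps += 1
    return steps


def make_pipeline():
    """Three operators holding 3, 2, and 1 buffered items respectively."""
    return [Operator("source", 3), Operator("transform", 2), Operator("sink", 1)]
```

In this toy model, pausing `make_pipeline()` instantly takes no steps but leaves all six items buffered, while draining leaves nothing buffered but needs every item to pass through all remaining operators first.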

mavam commented 11 months ago

> One design question: When we pause a pipeline, should that cut operators off from left to right in the pipeline and let them consume their input entirely to minimize buffer usage, or should it pause the pipeline instantly and keep the buffers around?

NB: We need to revisit this more rigorously once we enter the territory of exactly-once processing.

It would be nice if we gave operators a bit of time to "flush" whatever they can. At the same time, some operators simply block forever (e.g., sort), so it would only matter when we have a "realtime" pipeline.
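
A rough sketch of that idea, assuming a step budget as a stand-in for a wall-clock grace period. The `PassThrough` and `Sort` classes and `pause_with_grace` are hypothetical illustrations, not actual Tenzir operators or APIs.

```python
class PassThrough:
    """Streaming operator: can emit each event as soon as it arrives."""

    def __init__(self, events):
        self.pending = list(events)

    def flush_one(self):
        """Emit one pending event, or None if nothing can be emitted."""
        return self.pending.pop(0) if self.pending else None


class Sort:
    """Blocking operator: cannot emit anything until its input has ended."""

    def __init__(self, events):
        self.pending = list(events)
        self.input_ended = False

    def flush_one(self):
        """Emit the smallest pending event, but only after the input ended."""
        if not self.input_ended or not self.pending:
            return None
        self.pending.sort()
        return self.pending.pop(0)


def pause_with_grace(op, budget):
    """Let `op` flush for at most `budget` steps, then pause it regardless.

    Returns (flushed_events, number_of_events_still_buffered).
    """
    flushed = []
    for _ in range(budget):
        event = op.flush_one()
        if event is None:
            break  # the operator has nothing it can emit right now
        flushed.append(event)
    return flushed, len(op.pending)
```

With a budget of 10, `PassThrough([3, 1, 2])` flushes all three events, whereas `Sort([3, 1, 2])` flushes nothing and still holds all three, because its input never ended. That is why the grace period only helps streaming-style operators in this model.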

dominiklohmann commented 10 months ago

@rdettai raised the very valid point this morning that regardless of whether we want to distinguish between created/completed/stopped in the frontend, we should do so in the backend, because that gives us the choice in the frontend.
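
One way to picture this, as a sketch only: the backend keeps fine-grained states, and the frontend maps them to whatever it wants to display. The state names created/completed/stopped come from the comment above; `running` and `paused`, the transition table, and the `frontend_label` mapping are assumptions for illustration and not the actual pipeline manager API.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    STOPPED = "stopped"


# Assumed transitions; the real state machine may differ.
TRANSITIONS = {
    State.CREATED: {State.RUNNING},
    State.RUNNING: {State.PAUSED, State.COMPLETED, State.STOPPED},
    State.PAUSED: {State.RUNNING, State.STOPPED},
    State.COMPLETED: {State.RUNNING},
    State.STOPPED: {State.RUNNING},
}


def transition(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def frontend_label(state):
    """Hypothetical frontend view that collapses three backend states."""
    if state in (State.CREATED, State.COMPLETED, State.STOPPED):
        return "stopped"
    return state.value
```

Because the backend keeps the three states apart, the frontend can collapse them as `frontend_label` does here, or start showing them separately later without a backend change.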