dominiklohmann closed this 10 months ago
One design question: when we pause a pipeline, should we cut operators off from left to right and let them consume their input entirely to minimize buffer usage, or should we pause the whole pipeline instantly and keep the buffers around?
It's a trade-off between reducing memory usage for paused pipelines and making pausing take effect instantly.
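To make the trade-off concrete, here is a rough sketch of the two strategies on a toy pipeline. The `Stage` class and both function names are made up for illustration and are not taken from the actual codebase; the last stage is assumed to act as a sink that discards its output.

```python
from collections import deque


class Stage:
    """Hypothetical operator: a function plus the buffer feeding it."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.inbox = deque()


def pause_instantly(stages):
    """Stop right away. Returns how many items stay buffered while paused."""
    return sum(len(stage.inbox) for stage in stages)


def pause_by_draining(stages):
    """Cut off input left to right and let each stage consume its buffer.

    Returns how many items stay buffered afterwards.
    """
    for i, stage in enumerate(stages):
        downstream = stages[i + 1] if i + 1 < len(stages) else None
        while stage.inbox:
            result = stage.fn(stage.inbox.popleft())
            if downstream is not None:
                # Hand the result to the next stage, which drains it in turn.
                downstream.inbox.append(result)
    return sum(len(stage.inbox) for stage in stages)
```

Pausing instantly is O(1) but keeps every buffered item alive until resume; draining frees the buffers but takes as long as the slowest operator needs to work through its input.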
NB: We need to revisit this more rigorously once we enter the territory of exactly-once processing.
It would be nice if we give operators a bit of time to "flush" whatever they can. At the same time, some operators simply block forever (e.g., `sort`), so it would only matter when we have a "realtime" pipeline.
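A grace period for flushing could look roughly like the sketch below. `Operator`, its `blocking` flag, and `pause_with_grace` are hypothetical names for illustration only; the assumption is that a blocking operator such as a sort cannot emit partial results and therefore keeps its buffer until the pipeline resumes.

```python
import time
from collections import deque


class Operator:
    """Hypothetical operator with a buffer of pending items."""

    def __init__(self, name, blocking=False):
        self.name = name
        self.blocking = blocking  # e.g. a sort that needs all input first
        self.buffer = deque()

    def flush(self, deadline):
        """Emit buffered items until the buffer is empty or time runs out."""
        if self.blocking:
            return []  # nothing can be emitted early; the buffer is kept
        emitted = []
        while self.buffer and time.monotonic() < deadline:
            emitted.append(self.buffer.popleft())
        return emitted


def pause_with_grace(operators, grace_seconds):
    """Give every operator until a shared deadline to flush what it can."""
    deadline = time.monotonic() + grace_seconds
    return {op.name: op.flush(deadline) for op in operators}
```

With a shared deadline the pause latency is bounded by `grace_seconds`, regardless of how many operators the pipeline has.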
@rdettai raised the very valid point this morning that regardless of whether we want to distinguish between created/completed/stopped in the frontend, we should do so in the backend, because that gives us the choice in the frontend.
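As a sketch of what that could mean in practice: the backend keeps the fine-grained states, and the frontend decides whether to show them or collapse them. The enum, the `running` and `paused` members, and `frontend_label` are assumptions for illustration, not the actual backend API.

```python
from enum import Enum


class PipelineState(Enum):
    """Hypothetical backend states; only the backend needs all of them."""

    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    STOPPED = "stopped"


# States a frontend might choose to present as one "stopped" bucket.
NOT_RUNNING = {
    PipelineState.CREATED,
    PipelineState.COMPLETED,
    PipelineState.STOPPED,
}


def frontend_label(state, distinguish=False):
    """Map a backend state to the label the frontend shows."""
    if not distinguish and state in NOT_RUNNING:
        return "stopped"
    return state.value
```

Because the collapse happens in `frontend_label`, switching the frontend to the detailed view later is a one-flag change and needs no backend migration.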
We want to allow pipelines to be paused and resumed through the frontend.
💯 Definition of Done