numaproj / numaplane

Control Plane for Numaproj
Apache License 2.0

Test 2 NumaflowControllerRollouts with "pause-and-drain" strategy #418

Closed juliev0 closed 4 hours ago

juliev0 commented 5 days ago

Summary

The Paved Path team is adding support for 2 (or potentially more) NumaflowControllerRollouts.

Their plan is for each Pipeline/MonoVertex to have the capability to update the version of Numaflow that it's on (if there's a new version, they will add another NumaflowControllerRollout).

Since the "progressive" strategy isn't ready yet, it would be great if the code they're building could be used with "pause and drain" (PPND), and I believe it should be. But this needs to be verified through testing.

Say there's:

- controller 0
- controller 1
- ISBServiceRollout on controller 0
- PipelineRollout on controller 0
- MVXRollout on controller 0

Say they update all 3 Rollouts to Controller Instance Annotation 1.

I believe the ResourceNeedsUpdating() function should determine PPND/Progressive for all 3 Rollouts. For the Pipeline and isbsvc, this means it should use the PPND strategy. For the MonoVertex, it should do a standard direct apply, since there's no PPND strategy for it.
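A rough sketch of the decision I'm expecting, not the actual ResourceNeedsUpdating() code. All type, constant, and function names here are made up for illustration, and it assumes the only difference being considered is the controller instance:

```go
package main

import "fmt"

// UpgradeStrategy is a hypothetical name for the strategy chosen for a Rollout.
type UpgradeStrategy string

const (
	StrategyNone  UpgradeStrategy = "none"            // nothing changed
	StrategyApply UpgradeStrategy = "direct-apply"    // apply the new spec directly
	StrategyPPND  UpgradeStrategy = "pause-and-drain" // pause, drain, then apply
)

// RolloutKind is a hypothetical enum of the three Rollout types in the scenario.
type RolloutKind string

const (
	KindISBService RolloutKind = "ISBServiceRollout"
	KindPipeline   RolloutKind = "PipelineRollout"
	KindMonoVertex RolloutKind = "MonoVertexRollout"
)

// chooseStrategy returns the expected strategy when a Rollout moves from
// one controller instance to another (illustrative only).
func chooseStrategy(kind RolloutKind, currentInstance, desiredInstance string) UpgradeStrategy {
	if currentInstance == desiredInstance {
		return StrategyNone
	}
	switch kind {
	case KindISBService, KindPipeline:
		return StrategyPPND
	default:
		// MonoVertex has no PPND strategy, so it falls back to direct apply.
		return StrategyApply
	}
}

func main() {
	for _, kind := range []RolloutKind{KindISBService, KindPipeline, KindMonoVertex} {
		fmt.Printf("%s: %s\n", kind, chooseStrategy(kind, "0", "1"))
	}
}
```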

Since the ISBServiceRollout reconciler will request that its Pipelines pause, and the Pipeline will also want to pause on its own, the Pipeline will pause (drain). Then both the isbsvc and the Pipeline should be updated, and the Pipeline should go back to "Running".
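To make the expected sequence concrete, here is a toy state machine. It is a simplification under my own assumptions (a single combined reconcile step, draining that always completes in one step, and hypothetical names throughout), not how the reconcilers are actually structured:

```go
package main

import "fmt"

// state is a hypothetical snapshot of the Pipeline and its isbsvc.
type state struct {
	pipelinePhase    string // "Running", "Pausing", or "Paused"
	pipelineInstance string // controller instance the Pipeline is on
	isbsvcInstance   string // controller instance the isbsvc is on
}

// reconcileOnce advances the toy model by one step toward the desired instance.
func reconcileOnce(s state, desired string) state {
	// A pause is wanted if either the isbsvc or the Pipeline needs updating.
	needsUpdate := s.pipelineInstance != desired || s.isbsvcInstance != desired
	switch {
	case needsUpdate && s.pipelinePhase == "Running":
		s.pipelinePhase = "Pausing"
	case needsUpdate && s.pipelinePhase == "Pausing":
		s.pipelinePhase = "Paused" // assume the drain has finished
	case needsUpdate && s.pipelinePhase == "Paused":
		// Safe to update both resources now that the Pipeline is drained.
		s.pipelineInstance, s.isbsvcInstance = desired, desired
	case !needsUpdate && s.pipelinePhase != "Running":
		s.pipelinePhase = "Running"
	}
	return s
}

func main() {
	s := state{"Running", "0", "0"}
	for i := 0; i < 4; i++ {
		s = reconcileOnce(s, "1")
		fmt.Printf("%+v\n", s)
	}
}
```

In this model the Pipeline goes Running, Pausing, Paused, then both resources move to instance 1, then back to Running.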

Please confirm that all of what I've written above in fact happens and if not, let me know.



juliev0 commented 5 days ago

@dpadhiar whenever you're done with the artifact improvements, could I bother you to try this out?

juliev0 commented 5 days ago

cc @Krithika3 @mshakira

juliev0 commented 1 day ago

Thinking through what will happen during reconciliation within Numaflow: there is this line, which I believe marks a Pipeline as Failed if the isbsvc it's using is on the other Numaflow Controller.

Let's say the front end updates both ISBServiceRollout and PipelineRollout in the same PR, which I believe they would.

Whether the ISBServiceRollout change or the PipelineRollout change is reconciled first is non-deterministic. Therefore, the Pipeline could momentarily enter a "Failed" state while the two are mismatched. I don't think this should be a problem, however.
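A small illustration of the transient mismatch. This is not the Numaflow code; the annotation key is my assumption of what the controller instance annotation looks like, and the helper names are hypothetical:

```go
package main

import "fmt"

// instanceKey is an assumed annotation key for the controller instance.
const instanceKey = "numaflow.numaproj.io/instance"

// mismatched reports whether the Pipeline and its isbsvc are on different instances.
func mismatched(pipelineAnn, isbsvcAnn map[string]string) bool {
	return pipelineAnn[instanceKey] != isbsvcAnn[instanceKey]
}

// mismatchTrace applies the instance change in the given order and records
// whether the two resources are mismatched after each individual update.
func mismatchTrace(order []string, desired string) []bool {
	pipelineAnn := map[string]string{instanceKey: "0"}
	isbsvcAnn := map[string]string{instanceKey: "0"}
	var trace []bool
	for _, name := range order {
		switch name {
		case "pipeline":
			pipelineAnn[instanceKey] = desired
		case "isbsvc":
			isbsvcAnn[instanceKey] = desired
		}
		trace = append(trace, mismatched(pipelineAnn, isbsvcAnn))
	}
	return trace
}

func main() {
	fmt.Println(mismatchTrace([]string{"isbsvc", "pipeline"}, "1"))
	fmt.Println(mismatchTrace([]string{"pipeline", "isbsvc"}, "1"))
}
```

Either order passes through one mismatched step and then converges, which is why I'd expect any "Failed" state to be temporary.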

juliev0 commented 4 hours ago

Unfortunately, it looks like Derek, Vigith, and the front end team have dropped the idea of having 2 Numaflow Controller Rollouts in favor of making things simple for the front end team. So, we won't be doing this after all.