[Feature] Stateful Application Failover Support

RainbowMango commented 2 weeks ago

Summary

Karmada’s scheduling logic assumes that the resources it schedules and reschedules are stateless. In some cases, users may want to preserve certain state so that applications can resume from where they left off in the previous cluster.

For CRDs dealing with data-processing (such as Flink or Spark), it can be particularly useful to restart applications from a previous checkpoint. That way applications can seamlessly resume processing data while avoiding double processing.

This feature aims to introduce a generalized way for users to define application state preservation in the context of cluster-to-cluster failovers.

Proposal

[ ] state preservation (@Dyex719, #5116)

Iteration Tasks -- Part-1: Ensure the scheduler skips the clusters that triggered the failover

[x] API change: Add a comment describing the use case for PurgeMode Immediately, based on the discussion at https://github.com/karmada-io/karmada/pull/5251#issuecomment-2425931916. (@RainbowMango)
[x] API change: Introduce PurgeMode to GracefulEvictionTask in ResourceBinding. (@mszacillo, #5816)
[x] Make changes to GracefulEvictCluster() to set PurgeMode during the eviction process. (@XiShanYongYe-Chang, #5821)
[ ] Make changes to the RB application failover controller and CRB application failover controller to build the eviction task for PurgeMode Immediately. (@mszacillo)
[x] Make changes to the taint controller to configure the eviction task when evicting ResourceBinding and ClusterResourceBinding. (@XiShanYongYe-Chang, https://github.com/karmada-io/karmada/pull/5879)
- Note: PurgeMode Immediately cannot be guaranteed to work as expected here, because at this point Karmada might not be able to reach the member clusters due to a network outage. Set PurgeMode to Graciously by default as a compromise?
[ ] Make changes to the binding controller and cluster binding controller to clean up works from the cluster in the eviction task when the purge mode is Immediately.
[ ] Double-check whether we need to make changes to the graceful eviction controller.
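
To illustrate the goal of Part-1, here is a minimal sketch of how a scheduler could skip clusters that have a pending eviction task. The `EvictionTask` struct and the `candidateClusters` helper are hypothetical and simplified for illustration; they are not the actual Karmada types or scheduler code. Only the two purge modes named in this issue are modelled.

```go
package main

import "fmt"

// PurgeMode describes how the workload is removed from the evicted cluster.
// Only the two modes discussed in this issue are modelled here.
type PurgeMode string

const (
	PurgeModeImmediately PurgeMode = "Immediately"
	PurgeModeGraciously  PurgeMode = "Graciously"
)

// EvictionTask is a simplified, hypothetical stand-in for the
// GracefulEvictionTask entry kept in a ResourceBinding.
type EvictionTask struct {
	FromCluster string
	PurgeMode   PurgeMode
}

// candidateClusters returns the clusters that remain eligible for scheduling:
// any cluster named in a pending eviction task is skipped.
func candidateClusters(all []string, tasks []EvictionTask) []string {
	evicting := make(map[string]bool, len(tasks))
	for _, t := range tasks {
		evicting[t.FromCluster] = true
	}
	var out []string
	for _, c := range all {
		if !evicting[c] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	tasks := []EvictionTask{{FromCluster: "member1", PurgeMode: PurgeModeImmediately}}
	// prints [member2 member3]
	fmt.Println(candidateClusters([]string{"member1", "member2", "member3"}, tasks))
}
```

In this sketch the purge mode does not affect the filtering; it only records how the old workload should be cleaned up, which is the concern of the binding controller tasks above.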

Iteration Tasks -- Part-2: state preservation and feed

[ ] API change: Introduce StatePreservation to PropagationPolicy. (See the API design here)
[ ] API change: Introduce PreservedLabelState to ResourceBinding. (See the API design here)
[ ] Make changes to RB/CRB application controller to build PreservedLabelState when triggering eviction.
[ ] Make changes to taint manager to build PreservedLabelState when triggering eviction.
[ ] Make changes to RB/CRB controller to feed the PreservedLabelState to the new clusters (the failover targets).
[ ] Double-check whether we need to introduce a default label to distinguish the failover type. (Waiting for a real-world use case.)
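
As a rough sketch of the "feed" step in Part-2, the preserved state could be merged into the labels of the workload that is created on the new cluster. The `feedPreservedState` helper, the plain `map[string]string` representation, and the label key and value are assumptions for illustration only; the real PreservedLabelState API is defined in the linked design.

```go
package main

import "fmt"

// feedPreservedState returns a copy of the workload labels with the preserved
// state added on top. The input maps are left unmodified. This is a
// hypothetical helper, not part of Karmada.
func feedPreservedState(labels, preserved map[string]string) map[string]string {
	out := make(map[string]string, len(labels)+len(preserved))
	for k, v := range labels {
		out[k] = v
	}
	for k, v := range preserved {
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical label key and value captured from the previous cluster.
	preserved := map[string]string{"karmada.io/failover-flink-checkpoint": "1042"}
	labels := feedPreservedState(map[string]string{"app": "flink-job"}, preserved)
	fmt.Println(labels["karmada.io/failover-flink-checkpoint"]) // prints 1042
}
```

A tool on the member cluster could then read that label to decide where the application should resume.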

Iteration Tasks -- Part-3: failover history

The failover history might be optional, as we don't rely on it. TBD: based on #5251

mszacillo commented 2 weeks ago

Looks great, thank you!

Could we add a checklist item to include a default failoverType label onto the resource that has been failed over?

RainbowMango commented 2 weeks ago

Could we add a checklist item to include a default failoverType label onto the resource that has been failed over?

I don't feel strongly that we need it, because according to the draft design, you can set the label name to whatever you expect. For instance, you can set the label name to karmada.io/failover-flink-checkpoint. Then you can configure Kyverno with that label. Am I right?

RainbowMango commented 2 weeks ago

@mszacillo I'm trying to split the whole feature into small pieces, hoping more people could get involved and accelerate development.

For now it's a work in progress, but I'm glad you noticed it. Let me know if you have any comments or questions.

mszacillo commented 2 weeks ago

@RainbowMango I think that's a good idea, and having this feature available faster would be great. :)

Do you have a preference on who will be working on which task? If not I can pick up the introduction of PurgeMode to the GracefulEvictionTask today.

In addition, could we start a Slack working group channel? Given the time differences, I think being able to have more rapid conversations on Slack would improve the implementation pace.

mszacillo commented 2 weeks ago

I don't feel strongly that we need it, because according to the draft design, you can set the label name to whatever you expect.

That's true, we can simply declare our own label name for the use-case. In the case of a failover, it might be helpful to distinguish between cluster + application failovers, and only Karmada has the context. But perhaps I'm creating a use-case before it's even appeared.

RainbowMango commented 2 weeks ago

Do you have a preference on who will be working on which task? If not I can pick up the introduction of PurgeMode to the GracefulEvictionTask today.

Sure, go for it! I've assigned this task to you. I think you are the feature owner, so it would be great if you could work on it :) Generally speaking, anyone can take a task without an assignment by leaving a comment here. The issue owner (me, in this case) will assign it by adding the name to the end of the task.

RainbowMango commented 2 weeks ago

In the case of a failover, it might be helpful to distinguish between cluster + application failovers, and only Karmada has the context. But perhaps I'm creating a use-case before it's even appeared.

Yeah, the only benefit I can see is that it might help to distinguish failover types, but I think there is no rush to do it until there is a solid use case. I added a checklist item for this; we can revisit it later.

Double-check whether we need to introduce a default label to distinguish the failover type. (Waiting for a real-world use case.)

RainbowMango commented 1 week ago

Make changes to the RB application failover controller and CRB application failover controller to build the eviction task for PurgeMode Immediately. (@mszacillo)

@mszacillo, I assigned this task to you per the discussion at https://github.com/karmada-io/karmada/pull/5821#pullrequestreview-2438835388.

karmada-io / karmada

[Feature] Stateful Application Failover Support #5788