v2: replace LINSTOR events with node agents

piraeusdatastore / piraeus-ha-controller

High Availability Controller for stateful workloads using storage provisioned by Piraeus

Apache License 2.0

v2: replace LINSTOR events with node agents #14

Closed WanzenBug closed 2 years ago

WanzenBug commented 2 years ago

Replace the old HA controller with a new version based on listening directly to DRBD events.

If a node detects that it could promote a resource that should be primary on another node, that other node must have failed. In that case we temporarily mark the failed node as unschedulable, then try to evict/delete all the consuming pods. Finally, we force-delete the volume attachment, since the node might be offline and unable to detach the volume itself.
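
A rough sketch of that decision and the order of the steps, in plain Go with no Kubernetes or DRBD calls. All type, constant, and function names here (`ResourceView`, `Step`, `PlanFailover`, ...) are made up for illustration and are not taken from the controller's code:

```go
package main

// ResourceView is a hypothetical, simplified picture of one DRBD resource as
// seen by the agent on the local node. None of these names come from the PR.
type ResourceView struct {
	Name         string
	MayPromote   bool     // DRBD reports the local node could become primary
	AttachedNode string   // node where the volume is attached, "" if none
	Pods         []string // pods on AttachedNode that consume this resource
}

// Step is one action in the failover sequence.
type Step struct {
	Kind   string
	Target string
}

// Step kinds, in the order the description lists them.
const (
	MarkUnschedulable     = "mark-node-unschedulable"
	EvictPod              = "evict-pod"
	ForceDeleteAttachment = "force-delete-volume-attachment"
)

// PlanFailover returns the ordered steps for one resource, or nil when the
// local view gives no reason to suspect the attached node.
func PlanFailover(localNode string, r ResourceView) []Step {
	// Being able to promote a resource that belongs to another node is the
	// signal that the other node has failed.
	if !r.MayPromote || r.AttachedNode == "" || r.AttachedNode == localNode {
		return nil
	}
	steps := []Step{{Kind: MarkUnschedulable, Target: r.AttachedNode}}
	for _, pod := range r.Pods {
		steps = append(steps, Step{Kind: EvictPod, Target: pod})
	}
	// Last step: the failed node may be offline and unable to detach itself.
	return append(steps, Step{Kind: ForceDeleteAttachment, Target: r.Name})
}
```

For example, calling `PlanFailover("node-a", ...)` for a promotable resource attached to `node-b` with two pods yields four steps: mark `node-b` unschedulable, evict both pods, then force-delete the attachment. Returning a plan instead of acting directly keeps the ordering easy to test; the real controller may well be structured differently.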

There are some secondary additions to the new HA Controller, mainly around better handling of suspended resources:

If Pods are in terminating state, but a backing resource is suspended, that resource is "force-secondaried".
If a resource is in "force-io-error" mode but still has attached pods, those pods are evicted, too.
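
The two rules above could be expressed roughly like this. It is only a sketch under assumed, simplified types: `Resource`, `Pod`, `Remediation`, and `Remediate` are hypothetical names, and the actual force-secondary and eviction calls are left out:

```go
package main

// Pod is a hypothetical, minimal view of a pod that uses a resource.
type Pod struct {
	Name        string
	Terminating bool
}

// Resource is a hypothetical, minimal view of a DRBD resource on this node.
type Resource struct {
	Name         string
	Suspended    bool // I/O is currently suspended
	ForceIOError bool // resource is in "force-io-error" mode
	Pods         []Pod
}

// Remediation lists what should be done for one resource.
type Remediation struct {
	ForceSecondary bool     // demote the resource so terminating pods can finish
	EvictPods      []string // pods to evict
}

// Remediate applies the two rules to a single resource.
func Remediate(r Resource) Remediation {
	var out Remediation
	for _, p := range r.Pods {
		// Rule 1: a terminating pod on a suspended resource would hang, so
		// the resource is force-secondaried.
		if r.Suspended && p.Terminating {
			out.ForceSecondary = true
		}
		// Rule 2: pods still attached to a force-io-error resource are evicted.
		if r.ForceIOError {
			out.EvictPods = append(out.EvictPods, p.Name)
		}
	}
	return out
}
```

A healthy resource, or one without pods, produces an empty `Remediation`, so the function can be run on every resource in every pass.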

WanzenBug commented 2 years ago

@rck May I ask you to review this? It's basically a complete rewrite, so a lot of changes to review at once :/

I tried to split the commits (with the benefit of hindsight and a lot of rebasing), so you may want to look at them one by one.

rck commented 2 years ago

> @rck May I ask you to review this? It's basically a complete rewrite, so a lot of changes to review at once :/

Sure thing. I can't guarantee that I can do it today, and I'm out of office tomorrow, but if I don't make it by Thu, feel free to nag me on Fri.

WanzenBug commented 2 years ago

> I would have wished we could have reused existing components instead of reimplementing all the JSON structs, the polling, and whatnot

We can still do that; maybe it's something we want to discuss further. I only used polling for the initial prototype, but noticed that it was working well enough. My main concern with drbd-reactor is missed events. With polling we always detect suspended resources. With events sent by drbd-reactor, I could see a scenario where reactor restarts (maybe it crashed?) and I don't get notified about a changed quorum status.
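
To illustrate the point about polling: every pass rebuilds the list of suspended resources from a full snapshot, so a restart on either side cannot lose a state change the way a dropped event could. This is a minimal sketch only; `Snapshot`, `SuspendedIn`, and `PollSuspended` are hypothetical names, and `fetch` stands in for however the current DRBD state is actually read:

```go
package main

import (
	"sort"
	"time"
)

// Snapshot is a hypothetical full view of the node: resource name -> suspended.
type Snapshot map[string]bool

// SuspendedIn returns the names of all suspended resources, sorted so the
// result is deterministic.
func SuspendedIn(snap Snapshot) []string {
	var names []string
	for name, suspended := range snap {
		if suspended {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// PollSuspended calls fetch once per interval until stop is closed and hands
// the complete list of suspended resources to handle on every pass. Nothing
// is derived from earlier passes, so there is no event that could be missed.
func PollSuspended(stop <-chan struct{}, interval time.Duration,
	fetch func() (Snapshot, error), handle func(suspended []string)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		// A failed fetch is skipped; the next pass sees the full state again.
		if snap, err := fetch(); err == nil {
			handle(SuspendedIn(snap))
		}
		select {
		case <-stop:
			return
		case <-ticker.C:
		}
	}
}
```

The trade-off is latency and repeated work: a state change is noticed at most one interval late, and `handle` has to tolerate seeing the same suspended resource on consecutive passes.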