k0sproject / k0s

k0s - The Zero Friction Kubernetes
https://docs.k0sproject.io

CRDT based Control Plane State #1625

Open malarinv opened 2 years ago

malarinv commented 2 years ago

Is your feature request related to a problem? Please describe.

Etcd is a strongly consistent KV store, so it compromises on availability for edge use cases. A lightweight, eventually consistent control-plane state store based on CRDTs could make k0s much more scalable and efficient.
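To make the eventual-consistency idea concrete, here is a minimal sketch of a last-writer-wins (LWW) map, one of the simplest state-based CRDTs. It is an illustration only and is not the design from the linked paper or anything in k0s; the names `LWWMap`, `put`, `get`, and `merge`, and the `/pods/...` keys, are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical illustration only: not taken from the paper or from k0s.
Stamp = Tuple[int, str]  # (logical clock, replica id); the id breaks ties


@dataclass
class LWWMap:
    replica_id: str
    clock: int = 0
    entries: Dict[str, Tuple[Stamp, Optional[str]]] = field(default_factory=dict)

    def put(self, key: str, value: Optional[str]) -> None:
        """Write locally; never waits for other replicas."""
        self.clock += 1
        self.entries[key] = ((self.clock, self.replica_id), value)

    def get(self, key: str) -> Optional[str]:
        entry = self.entries.get(key)
        return entry[1] if entry else None

    def merge(self, other: "LWWMap") -> None:
        """For every key, keep the entry with the larger stamp."""
        for key, (stamp, value) in other.entries.items():
            if key not in self.entries or stamp > self.entries[key][0]:
                self.entries[key] = (stamp, value)
        self.clock = max(self.clock, other.clock)


edge = LWWMap("edge")
cloud = LWWMap("cloud")

# Network partition: both sides keep accepting writes independently.
edge.put("/pods/a", "Running")
cloud.put("/pods/a", "Pending")
cloud.put("/pods/b", "Running")

# Partition heals: the replicas exchange state in both directions.
edge.merge(cloud)
cloud.merge(edge)
```

After the two merges both replicas hold identical entries, which is the convergence property. The cost is visible too: the two concurrent writes to `/pods/a` are resolved by an arbitrary tie-break, so one of them is silently dropped, something a strongly consistent store would have rejected or serialized instead.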

Describe the solution you would like

Well described in this paper https://dl.acm.org/doi/pdf/10.1145/3434770.3459730

Describe alternatives you've considered

None of the other solutions I have seen appear to be as performant as the CRDT-based one.

Additional context

This could be a unique/special feature of k0s :wink: I am not the author of the paper; I just found it an interesting read that might improve k0s/k3s. So here I am.

jnummelin commented 2 years ago

This would be super cool indeed. But on the other hand, it's a LOT of work too. :)

Essentially, this would mean putting an etcd API in front of some CRDT implementation, much like kine provides an etcd API on top of SQL databases.
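As a rough illustration of why such a frontend is a lot of work: an etcd-style store is expected to offer more than get and put, notably revisions and compare-and-swap updates, which Kubernetes relies on for optimistic concurrency. The sketch below is a hypothetical in-memory stand-in; `RevisionedStore` and its method names are invented for the example and are not kine's or etcd's actual API.

```python
from typing import Dict, List, Optional, Tuple


class RevisionedStore:
    """Hypothetical stand-in for the semantics an etcd-style frontend
    has to offer; not kine's or etcd's real API."""

    def __init__(self) -> None:
        self.revision = 0  # store-wide, monotonically increasing
        self.data: Dict[str, Tuple[str, int]] = {}  # key -> (value, mod revision)

    def get(self, key: str) -> Optional[Tuple[str, int]]:
        return self.data.get(key)

    def create(self, key: str, value: str) -> Optional[int]:
        """Succeed only if the key does not exist yet."""
        if key in self.data:
            return None
        self.revision += 1
        self.data[key] = (value, self.revision)
        return self.revision

    def update(self, key: str, value: str, expected_revision: int) -> Optional[int]:
        """Compare-and-swap: succeed only if the key is unchanged since expected_revision."""
        current = self.data.get(key)
        if current is None or current[1] != expected_revision:
            return None
        self.revision += 1
        self.data[key] = (value, self.revision)
        return self.revision

    def list_prefix(self, prefix: str) -> List[Tuple[str, str, int]]:
        """Return (key, value, mod revision) for all keys under a prefix."""
        return sorted((k, v, r) for k, (v, r) in self.data.items() if k.startswith(prefix))


store = RevisionedStore()
rev = store.create("/registry/pods/a", "Pending")
ok = store.update("/registry/pods/a", "Running", expected_revision=rev)
stale = store.update("/registry/pods/a", "Failed", expected_revision=rev)  # rejected
```

With a single writer, the global revision counter and the compare-and-swap check are trivial. Replicas that accept writes independently during a partition cannot agree on either without coordination, so a CRDT-backed frontend would have to define what these operations mean in that setting.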

> Etcd is a strongly consistent KV store, so it compromises on availability for edge use cases.

Can you share a bit more insight into your use case? Remember that we can run k0s in a way where the control plane is fully isolated from the workers. For edge use cases, for example, you could run the control plane in a public cloud and the workers on your edge network. This way the control plane runs in a more "stable" environment.

mikhail-sakhnov commented 2 years ago

@malarinv thanks for sharing the paper, it was an interesting read.

One option for getting rid of etcd is to use SQLite via the kine storage mode (https://docs.k0sproject.io/v1.23.5+k0s.0/configuration/?h=kine#specstorage) and sync it with a third-party tool such as Litestream.

malarinv commented 2 years ago

> Can you share a bit more insight into your use case? Remember that we can run k0s in a way where the control plane is fully isolated from the workers. For edge use cases, for example, you could run the control plane in a public cloud and the workers on your edge network. This way the control plane runs in a more "stable" environment.

I would like my edge cluster (which also contains a control-plane node) to remain fully functional when a network partition makes the public-cloud-hosted control-plane nodes inaccessible.