k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Support embedded NATS as alternate cluster option to etcd #7451

Open bruth opened 1 year ago

bruth commented 1 year ago

Is your feature request related to a problem? Please describe.

Currently, embedded HA is supported only by etcd. With the option of embedded NATS that was added to Kine (as of v0.10.0/v0.10.1), NATS can be another option since it supports native clustering as well.

Describe the solution you'd like

Add native support for NATS as an alternative cluster option when doing --cluster-init.

Describe alternatives you've considered

There are no other native options. However, with an external NATS configuration (set via --datastore-endpoint), the nodes can be clustered without the k3s layer being aware of it. This provides HA/FT for the KV data, but k3s itself is not technically running in clustered mode.

Additional context

I plan on contributing this, but any guidance or things to be aware of is welcome!

gedw99 commented 1 year ago

Looking forward to seeing this

brandond commented 1 year ago

cc @rancher-max @cwayne18

rancher-max commented 1 year ago

This is cool and a great feature suggestion! Thank you!

I have some clarifying questions to determine how deep down the proverbial rabbit hole we should go:

  1. Is k3s expected to supply backup/restore functionality?
     a. Would this extend cluster-reset/cluster-reset-restore-path functionality?
     b. Would it be a new command?
     c. Does it follow nats' approach or is it done differently?
  2. Should an operator be able to run NATS in their cluster while also using it as the embedded datastore?
  3. Should NATS certs be rotated during manual certificate rotation?
     a. What is the expectation when an operator provides their own certs? Ref: https://docs.k3s.io/cli/certificate#using-custom-ca-certificates and specifically the note: etcd files are required even if embedded etcd is not in use.

brandond commented 1 year ago

Those are all good questions!

At the moment I see the embedded NATS as a replacement for sqlite only; while it is possible to host a multi-node cluster using the embedded NATS server, @bruth or someone on his team will need to provide instructions on how to set this up as I believe it requires a user-managed config file to accomplish.

If it is desired that K3s support multi-server clusters by managing the configuration and cluster membership, allow for backup/restore using the embedded NATS datastore, and all the other stuff that would provide complete parity with the embedded etcd datastore, I think that would also need to be driven by someone on the Synadia side.

gedw99 commented 1 year ago

I agree that some Ops aspects need to be added or documented.

bruth commented 1 year ago

need to provide instructions on how to set this up as I believe it requires a user-managed config file to accomplish.

This can be accomplished programmatically without config files for this particular setup. The Kine integration relies on the NATS server package which makes all of the config options available to be configured.

Since this would be a k3s feature, we would likely need to add support for additional query params on the Kine endpoint to indicate "cluster-mode" for example. But that design can get worked out to prevent needing users to manually define config files. It should be opt-in if they want more control, but not required.
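
As a rough illustration of the query-parameter idea, here is a minimal Go sketch of how an endpoint string might be parsed into cluster settings. The parameter names `clusterMode` and `replicas`, the `natsOptions` type, and the `parseEndpoint` helper are all hypothetical and chosen only for this example; they are not taken from Kine or k3s, and the actual design may look quite different.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// natsOptions holds settings parsed from a datastore endpoint.
// The type and its fields are hypothetical, for illustration only.
type natsOptions struct {
	Host        string
	ClusterMode bool
	Replicas    int
}

// parseEndpoint reads the hypothetical "clusterMode" and "replicas" query
// parameters from a nats:// endpoint. Without them it falls back to
// single-node defaults, so clustering stays opt-in.
func parseEndpoint(endpoint string) (natsOptions, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return natsOptions{}, err
	}
	if u.Scheme != "nats" {
		return natsOptions{}, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	opts := natsOptions{Host: u.Host, Replicas: 1}
	q := u.Query()
	if v := q.Get("clusterMode"); v != "" {
		if opts.ClusterMode, err = strconv.ParseBool(v); err != nil {
			return natsOptions{}, fmt.Errorf("invalid clusterMode: %w", err)
		}
	}
	if v := q.Get("replicas"); v != "" {
		if opts.Replicas, err = strconv.Atoi(v); err != nil {
			return natsOptions{}, fmt.Errorf("invalid replicas: %w", err)
		}
	}
	return opts, nil
}

func main() {
	opts, err := parseEndpoint("nats://localhost:4222?clusterMode=true&replicas=3")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", opts)
}
```

The point of the sketch is only that everything needed for clustering could travel in the endpoint string itself, so a config file becomes optional rather than required.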

the other stuff that would provide complete parity with the embedded etcd datastore, I think that would also need to be driven by someone on the Synadia side.

That is the intent for sure and why I am looking for guidance to understand the scope of complete parity! I don't want to boil the ocean in one pass if there is too much, but this is a good first list.

  1. Is k3s expected to supply backup/restore functionality?

If this functionality sits behind an interface, then we can hook in the standard NATS method of backing up and restoring stream/consumer state. I will need to read up on what k3s does today to compare.

  2. Should an operator be able to run NATS in their cluster while also using it as the embedded datastore?

They certainly should be able to run an additional server/cluster in k3s itself, independent of the embedded one, if they choose to. They shouldn't need to; however, I could understand the argument that they don't want to mix k3s and application concerns, or risk applications impacting the embedded server/cluster, and so prefer a clear boundary.

One could say the same about etcd, but one distinction with NATS is that, with its multi-tenancy support, the k3s/kine state and messaging would be completely isolated from any applications.

In terms of recommended approaches, having a set of use cases and/or considerations on whether to reuse the embedded cluster vs. running another container should be sufficient for people to make that decision.

  3. Should NATS certs be rotated during manual certificate rotation?

Based on the link, it looks like k3s is temporarily shut down to do the cert rotation? That would certainly work for NATS as well. Custom CAs can also be set in the NATS config.

bruth commented 1 year ago

Hey @VestigeJ, I saw you assigned this to yourself! Are you actively working on this or interested in collaborating?

VestigeJ commented 1 year ago

Hey @bruth, I DM'd you back on your home Slack. If you want to work together, I'd be more than happy to. :)

VestigeJ commented 1 year ago

@bruth Did this get put onto a back burner on the Synadia side?

brandond commented 1 year ago

@VestigeJ I think we're waiting on

udf2457 commented 1 year ago

@VestigeJ if it has been put on a back-burner then it would be very unfortunate that @bruth chose to highlight it on a recent podcast.

brandond commented 1 year ago

@udf2457 that comment is probably best directed at @bruth himself, not anyone on the K3s team. NATS support is maintained by the Synadia folks.

bruth commented 1 year ago

@udf2457 This was a temporary back burner. Focus has been on the NATS 2.10 release for the past couple of months. The Kine PR works, but there are a couple of remaining subtle recovery issues to address (likely tweaking a couple of timeouts). Now that 2.10 is out, focus is shifting back and I will have an update next week.

bruth commented 1 year ago

Hey folks, just giving a quick update so it doesn't get lost in the void again. I made some more progress today on the Kine PR (k3s-io/kine#194), including porting the client code to the new JetStream API. I am debugging a few remaining things, but planning to have it ready for review and merge early next week.

As it pertains to this issue, it will support HA mode without needing to change anything in k3s itself. This is a simpler option/better outcome IMO given how intertwined etcd as a dependency is (outside of kine).

Regarding backup/restore, this can be achieved out-of-band using standard NATS utilities. If there is a strong desire to get them baked into k3s utilities, I am happy to move that along.

bruth commented 1 year ago

Converted https://github.com/k3s-io/kine/pull/194 to ready for review. There are some final bits to clean up and a couple of failure cases to test, but it is in a good spot. Docs will come in the next couple of days.

brandond commented 11 months ago

Bumping this back out; embedded nats support is still disabled by build flag. We'll need to add -tags nats to the K3s build flags to enable this.

At the moment nats only supports external servers.

bruth commented 11 months ago

@brandond Other than documentation, what would be helpful to have this be supported in v1.29?

brandond commented 11 months ago

Docs would be good, and maybe get a PR open now to add the build flag so we can see what the current size impact is?

dereknola commented 11 months ago

Looks like it adds about 2MB to the K3s size. I'm seeing the binary go from 58MB to 60MB

derek@degion:~/rancher/k3s$ ls -lh ./dist/artifacts/
total 247M
-rwxr-xr-x 1 derek derek  60M Nov 16 09:54 k3s

VestigeJ commented 11 months ago

Testing note - stalled currently for December or January releases

m3nowak commented 2 weeks ago

@brandond Is this feature still planned?

brandond commented 2 weeks ago

Conformance tests need to pass first: