rancher / rke2

https://docs.rke2.io/
Apache License 2.0
Hard to distinguish manual vs recurring snapshot in an RKE2 cluster #2695

Open sowmyav27 opened 2 years ago

sowmyav27 commented 2 years ago

Rancher Server Setup

Information about the Cluster

User Information

Describe the bug

Hard to distinguish manual vs recurring snapshot in an RKE2 cluster

To Reproduce

Manual snapshot: sowmya-etcd-2-on-demand-sowmya-etcd-2-pool1-b66908a4-8nhj-7b289

Recurring snapshot: sowmya-etcd-2-etcd-snapshot-sowmya-etcd-2-pool1-b66908a4-e06095

Can we rename the snapshots to make this easier for users? Could recurring snapshots have "recurring" in their names?

snasovich commented 2 years ago

@katran001 @briandowns @cwayne18, this is an RKE2 issue per @Oats87. Could you please look into improving this?

brandond commented 2 years ago

The manual ones already have on-demand in them, don't they?

briandowns commented 2 years ago

The manual snapshots will contain the string "on-demand" and the scheduled snapshots will not. There is also the --etcd-snapshot-name flag, which allows user-defined names.
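As a stopgap, the naming convention described above can be checked by hand or in a small script. The following is only a sketch: `classify_snapshot` is a hypothetical helper, not part of RKE2 or Rancher, and it assumes the "on-demand" marker appears verbatim in the name, as in the two examples from this issue.

```python
def classify_snapshot(name: str) -> str:
    """Return "manual" if the name carries the on-demand marker, else "recurring"."""
    # Assumption: manual snapshots contain the literal string "on-demand" and
    # scheduled ones do not. A custom --etcd-snapshot-name, or a node name that
    # happens to contain "on-demand", would defeat this check.
    return "manual" if "on-demand" in name else "recurring"


# The two snapshot names reported in this issue.
snapshots = [
    "sowmya-etcd-2-on-demand-sowmya-etcd-2-pool1-b66908a4-8nhj-7b289",
    "sowmya-etcd-2-etcd-snapshot-sowmya-etcd-2-pool1-b66908a4-e06095",
]

for name in snapshots:
    print(f"{classify_snapshot(name)}: {name}")
```

This only works while the default names are in use, which is part of why the distinction is easy to miss.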

Oats87 commented 2 years ago

The issue I see with using --etcd-snapshot-name is that, in order to get "recurring" into the names of recurring snapshots, we would more or less need to render it into the RKE2 config file. At face value this is a simple operation, but it poses a problem if a user wants to manually run rke2 etcd-snapshot save: that command will inherit the config file on disk, so the manually created snapshots will also have "recurring" in their names.
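The inheritance problem can be illustrated with a toy model. This is a deliberately simplified sketch and not RKE2's actual resolution code: `effective_snapshot_name` is a hypothetical function, the precedence order (CLI flag, then config file, then default) is an assumption, and the default names are inferred from the snapshot names shown in this issue.

```python
from typing import Optional

# Assumed defaults, inferred from the snapshot names in this issue.
DEFAULT_NAMES = {"scheduled": "etcd-snapshot", "manual": "on-demand"}


def effective_snapshot_name(
    kind: str, config: dict, cli_name: Optional[str] = None
) -> str:
    """Toy model: an explicit CLI flag wins, then the config file, then the default."""
    if cli_name is not None:
        return cli_name
    # Both the server (scheduled snapshots) and a manual
    # `rke2 etcd-snapshot save` read the same config file in this model.
    return config.get("etcd-snapshot-name", DEFAULT_NAMES[kind])


no_override = {}                               # nothing rendered into the config
rendered = {"etcd-snapshot-name": "recurring"}  # name rendered by the planner

for config in (no_override, rendered):
    for kind in ("scheduled", "manual"):
        print(kind, "->", effective_snapshot_name(kind, config))
```

With the rendered config, both the scheduled and the manual case resolve to "recurring" in this model, which is exactly the collision described above. A user could still pass the flag explicitly on the command line, but nothing would prompt them to.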

The only other thing I can think of for this is to manually render the specific --etcd-snapshot-name argument into the service arguments in the systemd unit, but that is not ideal from a future-maintenance perspective (it's confusing and feels like a workaround). Furthermore, the current iteration of the RKE2 install script does not support setting arbitrary arguments for the service (unlike the K3s install script). Not that v2prov currently supports RPMs, but the RPMs also bring their own systemd unit, which the planner should not be touching.

I would rather be able to set a specific CLI argument for recurring etcd snapshot names, to help distinguish between the two. Alternatively, if the name can be pulled from an environment variable, theoretically the planner could render the corresponding environment variable file for the service, although this also feels like quite a workaround and opens us up to a lot of potential for regression on the v2prov side.

https://github.com/k3s-io/k3s/blob/313aaca547f030752788dce696fdf8c9568bc035/pkg/cli/cmds/etcd_snapshot.go#L29-L34 for reference

snasovich commented 2 years ago

@Oats87 , thank you for the detailed summary.

@briandowns @brandond @cwayne18 @katran001, it would be great if this enhancement could be implemented in K3s/RKE2. I see it as more of a "nice to have" at this time, though, so it is not a high priority.

briandowns commented 2 years ago

This might be a nice first task for one of the new folks on the team. I'll get it on next up.