kubernetes-retired / etcdadm

[EOL] etcdadm is a command-line tool for operating an etcd cluster
Apache License 2.0
765 stars 135 forks source link

Scaleway support in etcd-manager operational #345

Closed Mia-Cross closed 1 year ago

Mia-Cross commented 1 year ago

My previous PR (#344) was made in order to support master instance groups with more than one instance, which I recently learned we're not really supposed to do. This new one only fixes a typo in a print and adds a few comments to the code but with it the support for Scaleway in etcd-manager is complete.

NB: When adding new etcd members, the newly created volumes can attach themselves to the wrong instance : for example the instance master-2 can have the volume etcd-2-main and the volume etcd-3-events attached and the master-3 would have the etcd-3-main and the etcd-2-events, should I fix that ? The instances will always have one main and one events volume, but sometimes they get mixed up.

k8s-ci-robot commented 1 year ago

Hi @Mia-Cross. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
hakman commented 1 year ago

@Mia-Cross Can you take a look at my approach in https://github.com/kubernetes-sigs/etcdadm/pull/334 for fixing attaching to the wrong instance? Does it help you in any way?

Mia-Cross commented 1 year ago

I fixed the problem by recovering a part of my old work in multi-masters : the way I do it is by adding the instance-group tag to the volume in kops. Then in etcd-manager I can look for volumes to attach with this tag and remove it when I look for volumes in Poll() Is it ok ?

hakman commented 1 year ago

I think the more correct approach would be to send that tag as VolumeNameTag instead. This way you can use VolumeTag for finding peers and and add VolumeNameTag when looking for the exact volume to mount.

config.VolumeTag = []string{
    fmt.Sprintf("%s=%s", scaleway.TagClusterName, b.Cluster.Name),
    fmt.Sprintf("%s=%s", scaleway.TagNameEtcdClusterPrefix, etcdCluster.Name),
}
config.VolumeNameTag = fmt.Sprintf("%s=%s", scaleway.TagInstanceGroup, instanceGroupName)

vs

config.VolumeTag = []string{
    fmt.Sprintf("%s=%s", scaleway.TagClusterName, b.Cluster.Name),
    fmt.Sprintf("%s=%s", scaleway.TagNameEtcdClusterPrefix, etcdCluster.Name),
    fmt.Sprintf("%s=%s", scaleway.TagNameRolePrefix, scaleway.TagRoleMaster),
    fmt.Sprintf("%s=%s", scaleway.TagInstanceGroup, instanceGroupName),
}
config.VolumeNameTag = fmt.Sprintf("%s=%s", scaleway.TagNameEtcdClusterPrefix, etcdCluster.Name)
hakman commented 1 year ago

/ok-to-test

Mia-Cross commented 1 year ago

I had a lot of racing problems if the name tag was only the instance group name so I added the name of the etcd cluster to make sure that the FindVolumes() function only returns one volume to attach :

config.VolumeTag = []string{
    fmt.Sprintf("%s=%s", scaleway.TagClusterName, b.Cluster.Name),
    fmt.Sprintf("%s=%s", scaleway.TagNameEtcdClusterPrefix, etcdCluster.Name),
    fmt.Sprintf("%s=%s", scaleway.TagNameRolePrefix, scaleway.TagRoleMaster),
}
config.VolumeNameTag = fmt.Sprintf("%s=%s-%s", scaleway.TagInstanceGroup, instanceGroupName, etcdCluster.Name)
Mia-Cross commented 1 year ago

@hakman I've made new changes according to your comments on my kops PR, it should be good now.

hakman commented 1 year ago

Nice. Thanks @Mia-Cross. Happy with the end result?

hakman commented 1 year ago

/lgtm

Mia-Cross commented 1 year ago

Yes ! Thanks for your advice :)

hakman commented 1 year ago

/approve

k8s-ci-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman, Mia-Cross

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[etcd-manager/OWNERS](https://github.com/kubernetes-sigs/etcdadm/blob/master/etcd-manager/OWNERS)~~ [hakman] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment