kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0
3.58k stars 1.31k forks source link

Util Secret ParseSecretName not working correctly on clusterctl move #7936

Closed tidusete closed 6 months ago

tidusete commented 1 year ago

What steps did you take and what happened:

  1. Have a Management Cluster
  2. Create an Azure target cluster
  3. Launch clusterctl move --to-directory testing_backup --kubeconfig ~/.kube/$(management_cluster) -v6 --dry-run -n $(target_cluster) This is part of the output:
    No default config file available
    Moving to directory...
    Discovering Cluster API objects
    AzureClusterIdentity Count=1
    Machine Count=2
    AzureMachineTemplate Count=2
    ConfigMap Count=1
    MachineSet Count=1
    AzureMachine Count=2
    Cluster Count=1
    Secret Count=13
    KubeadmConfigTemplate Count=2
    MachineDeployment Count=2
    AzureCluster Count=1
    KubeadmConfig Count=2
    KubeadmControlPlane Count=1
    Total objects Count=31
    Excluding secret from move (not linked with any Cluster) name="cluster-identity-secret-<target_cluster>"

    As you can see, I would like to move the secret "cluster-identity-secret-"

What did you expect to happen: To not exclude the secret since its linked with the cluster that i'm doing the move. I would like to have on my testing_backup directory the secret resource cluster-identity-secret-" + all the files that are actually being moved.

Anything else you would like to add: Yes, I tried to debug a little bit and I believe the utility secretutil.ParseSecretName is not working properly since I'm always on the error condition. https://github.com/kubernetes-sigs/cluster-api/blob/d87a39c85f87f60858c578feaf5192d5679a140e/cmd/clusterctl/client/cluster/objectgraph.go#L434 I tried to check the util/secret/secret.go and I think that there is a misunderstood between the vars clusterName purposeSuffix and possibly the allSecretPurposes which is defined on the util/secret/consts.go https://github.com/kubernetes-sigs/cluster-api/blob/314d98e0141489e2ac7a2915c66628ad2f22d006/util/secret/secret.go#L58

Do we have a standard name convention for the resources that are being created? The reason is because when I launch kubectl --kubeconfig ~/.kube/$(management_cluster) get secrets -n $(target_cluster) I have the following output:

cluster-identity-secret-<target_cluster>          Opaque                    1      22h
<target_cluster>-ca                               cluster.x-k8s.io/secret   2      22h
<target_cluster>-cluster-identity-secret          Opaque                    1      58m
<target_cluster>-control-plane-xxxxx-azure-json   Opaque                    3      6h13m
<target_cluster>-control-plane-azure-json         Opaque                    3      22h
<target_cluster>-control-plane-xxxx              cluster.x-k8s.io/secret   2      22h
<target_cluster>-etcd                             cluster.x-k8s.io/secret   2      22h
<target_cluster>-kubeconfig                       cluster.x-k8s.io/secret   1      22h
<target_cluster>-proxy                            cluster.x-k8s.io/secret   2      22h
<target_cluster>-sa                               cluster.x-k8s.io/secret   2      22h

When I launch clusterctl move and i'm checking what I have inside the allSecretPurposes,clusterName,purposeSuffix. I have the following values using 2 different resources:

Using this secret cluster-identity-secret-<target_cluster>
clusterName "cluster-identity-secret"
purposeSuffix -> "<target_cluster>"
allSecretPurposes -> [kubeconfig ca etcd sa proxy apiserver-etcd-client]

Using this secret: <target_cluster>-cluster-identity-secret
clusterName <target_cluster>-cluster-identity
purposeSuffix secret
allSecretPurposes -> [kubeconfig ca etcd sa proxy apiserver-etcd-client]

Environment:

/kind bug [One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

killianmuldoon commented 1 year ago

Soft ownership, which uses the name-linking, isn't the easiest way to create the object graph for objects from providers. ownerReferences are used by clusterctl move for most objects AFAIK. Does the cluster-identity-secret have any ownerReferences?

tidusete commented 1 year ago

No, it doesn't have any relation. Maybe a better definition of a cluster-identity-secret could be really useful. I have been reading and trying to understand better the following documentation but I think is a little poor. https://capz.sigs.k8s.io/topics/multitenancy.html

killianmuldoon commented 1 year ago

It might be good to bring this up over at https://github.com/kubernetes-sigs/cluster-api-provider-azure for the real experts :slightly_smiling_face: You'll probably get better input on the definition of the secret - and the issues with clusterctl move over on that repo.

tidusete commented 1 year ago

I also was trying to follow this documentation but again no luck with it... https://release-1-2.cluster-api.sigs.k8s.io/clusterctl/provider-contract.html?highlight=clusterctl%20move#move

fabriziopandini commented 1 year ago

/triage support might be we can improve the error message above, but the scope of soft ownership detection has always been limited to cluster certificates, and this is re-enforced by https://github.com/kubernetes-sigs/cluster-api/blob/610b27ce5c884dd1c1221e80124f87dabd460e69/util/secret/secret.go#L63-L70 which accepts only well-known purposes/suffix for the secret name.

Also, as far as I remember the discussion about cluster identity:

And, given that we are talking about credential, there is also a security concern about including this info in the scope of clusterctl move --to-directory as suggested above.

Said that

clusterctl move offers some tools that can be used also in this case, like applying the force-move label to specific objects (the secret in this case), but this works only if the objects are in the same namespace being moved; more info in https://cluster-api.sigs.k8s.io/clusterctl/provider-contract.html#move.

Last but not least, please be aware that clusterctl move has been designed for the bootstrap/pivot scenario only (https://cluster-api.sigs.k8s.io/clusterctl/commands/move.html#warning-1), and for this use case we do not expect complex setup of cloud credentials

k8s-ci-robot commented 1 year ago

@fabriziopandini: The label(s) triage/support cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/7936#issuecomment-1403787220): >/triage support >might be we can improve the error message above, but the scope of soft ownership detection has always been limited to cluster certificates, and this is re-enforced by https://github.com/kubernetes-sigs/cluster-api/blob/610b27ce5c884dd1c1221e80124f87dabd460e69/util/secret/secret.go#L63-L70 which accepts only when known purposes for the secret. > >Also, as far as I remember the discussion about cluster identity: >- a cluster identity can be shared by many cluster, so it cannot rely on naming conventions as we are doing for cluster certificates. >- a cluster identity can refer to secrets in any namespace > >And I would like to add to the table a security concern, about using clusterctl move --to-directory while moving a cloud identity, which seems to me something that should be carefully evaluated and possibly avoided. > >Said that > >clusterctl move offers some tools that can be used also in this case, like applying the force-move label to objects, but this works only if the objects is in the same namespace being moved; more info in https://cluster-api.sigs.k8s.io/clusterctl/provider-contract.html#move. > >Also, please be aware that clusterctl move has been designed for the bootstrap/pivot scenario only (https://cluster-api.sigs.k8s.io/clusterctl/commands/move.html#warning-1), and for this use case only one cloud credential is required Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
fabriziopandini commented 1 year ago

/triage accepted

k8s-triage-robot commented 9 months ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

fabriziopandini commented 7 months ago

/priority backlog

fabriziopandini commented 6 months ago

/remove-kind bug /close due to lack of updates; the explanation + alternatives is provided in https://github.com/kubernetes-sigs/cluster-api/issues/7936#issuecomment-1403787220

k8s-ci-robot commented 6 months ago

@fabriziopandini: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/7936#issuecomment-2058562122): >/remove-kind bug >/close >due to lack of updates; the explanation + alternatives is provided in https://github.com/kubernetes-sigs/cluster-api/issues/7936#issuecomment-1403787220 Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.