garloff commented 10 months ago

As a cluster administrator, I want to be able to identify all IaaS resources that belong to a cluster, so I know what to save and what I can possibly safely delete when cleaning up.

Looking at the various resources at the OpenStack level, we do not consistently prefix everything with (per mgmthost) prefix and clustername. Here's what we find:

MgmtServer:

VM: $PREFIX-mgmtcluster (OK)
Root Volume: Unnamed (Not Good)
AppCred: $PREFIX-appcred (OK)
Port: $PREFIX-port (OK)
Network: $PREFIX-net (OK)
Subnet: $PREFIX-subnet (OK)
FloatingIP: Unnamed (but that can't be fixed -- we could assign a tag ... or at least pu a description)
SGs: $PREFIX-allow-icmp/ssh, $PREFIX-mgmt (OK)
Router: $PREFIX-rtr (OK)
Images: ubuntu-capi-image-v1.xx.x (We want this, as these should be shared for performance and cost)

Cluster

VMs: $PREFIX-$CLUSTER_NAME-md/control-plane (OK)
Root Volumes: $PREFIX-$CLUSTER_NAME-*-root (OK)
AppCred: $PREFIX-$CLUSTER_NAME-appcred (OK)
Ports: $PREFIX-$CLUSTER_NAME-md/control-plane-* (OK)
Networks: k8s-clusterapi-cluster-$CLUSTER_NAME-$CLUSTER_NAME (Not good: Lacks $PREFIX)
Subnets: dito (Not good)
SGs: $PREFIX-$CLUSTER_NAME-cilium (OK), k8s-cluster-$CLUSTER_NAME-$CLUSTER_NAME-secgroup-worker/controlplane (Not good: Lacks $PREFIX)
Router: k8s-clusterapi-cluster-$CLUSTER_NAME-$CLUSTER_NAME (Not good: Lacks $PREFIX)
Floating IP (kubeapi): Unnamed (we can't fix that. The description has the $CLUSTER_NAME, but not the $PREFIX, cloud also use a tag)
Loadbalancers (kubeapi): k8s-clusterapi-cluster-$CLUSTER_NAME-$CLUSTER_NAME-kubeapi (not good: Lacks $PREFIX)
Ingress
Floating IP (ingress): Unnamed (can't be fixed. The description has the $CLUSTER_NAME inside, but not the $PREFIX. Could also use a tag)
Loadbalancer (ingress): kubeservice$CLUSTER_NAME_ingress* (Not good: Lacks $PREFIX)
SGs (ingress): lb-sg-UUID-ingress* (Not good: Lacks $PREFIX and $CLUSTER_NAME, though the latter finds itself in the description)
Persistent Volumes
Volumes (PVs from CSI): pvc-UUID (Not good: Lacks $PREFIX and $CLUSTER_NAME -- and $VOLUME_NAME)

I may have overlooked some resource.

So, we should look at

mgmtcluster root volume (terraform)
FIP description/tag to mgmtcluster
Networks and Subnets pre cluster (lack $PREFIX)
Worker/Controlplane SGs (lack $PREFIX)
Same for per-cluster router
FIPs for kubeapi and ingress: description/tag should container $PREFIX and $CLUSTER_NAME
LBs for kubeapi and ingress lack $PREFIX in name
SGs (ingress) lack $PREFIX and $CLUSTER_NAME
PVs lack everything ...

I am not sure that all of this can easily be addressed without hacking OCCM/CSI. But then we maybe should do that ...

garloff commented 10 months ago

I should mention that providing consistency here would make the make fullclean option a lot better and safer. (See #484, #492).

fdobrovolny commented 10 months ago

Networking

The CAPO has in its networking module following:

const (
    networkPrefix string = "k8s-clusterapi"
    trunkResource string = "trunks"
    portResource  string = "ports"
)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/cc304b8af1d5f637419273b0702a0311b3c17b01/pkg/cloud/services/networking/service.go#L32

Router

Then the router name is obtained by:

func getRouterName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s", networkPrefix, clusterName)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/router.go#L286C1-L288C2

Subnet

The network and subnet name is obtained by:

func getSubnetName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s", networkPrefix, clusterName)
}

func getNetworkName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s", networkPrefix, clusterName)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/network.go#L391C1-L397C2

Floating IP description

func GetDescription(clusterName string) string {
    return fmt.Sprintf("Created by cluster-api-provider-openstack cluster %s", clusterName)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/utils/names/names.go#L23-L25

However, this description get used in multiple resources:

floating IP
trunk
port
router
loadbalancer
network

Floating IP Tags

    fpCreateOpts.FloatingNetworkID = openStackCluster.Status.ExternalNetwork.ID
    fpCreateOpts.Description = names.GetDescription(clusterName)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/floatingip.go#L50-L51

Here it would be probably possible to add tags.

Loadbalancer

The CAPO has the following in the load balancer module:

const (
    networkPrefix   string = "k8s-clusterapi"
    kubeapiLBSuffix string = "kubeapi"
)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/cc304b8af1d5f637419273b0702a0311b3c17b01/pkg/cloud/services/loadbalancer/loadbalancer.go#L42C1-L45

And then the name is obtained by:

func getLoadBalancerName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s-%s", networkPrefix, clusterName, kubeapiLBSuffix)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/cc304b8af1d5f637419273b0702a0311b3c17b01/pkg/cloud/services/loadbalancer/loadbalancer.go#L569-L571

fdobrovolny commented 10 months ago

Security Groups

const (
    secGroupPrefix     string = "k8s"
    controlPlaneSuffix string = "controlplane"
    workerSuffix       string = "worker"
    bastionSuffix      string = "bastion"
    remoteGroupIDSelf  string = "self"
)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/securitygroups.go#L30-L36

func getSecControlPlaneGroupName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s-secgroup-%s", secGroupPrefix, clusterName, controlPlaneSuffix)
}

func getSecWorkerGroupName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s-secgroup-%s", secGroupPrefix, clusterName, workerSuffix)
}

func getSecBastionGroupName(clusterName string) string {
    return fmt.Sprintf("%s-cluster-%s-secgroup-%s", secGroupPrefix, clusterName, bastionSuffix)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/network.go#L391C1-L397C2

fdobrovolny commented 10 months ago

Ingress

In cloud provider for OpenStack in load balancer following setting is set:

Load balancer

const (
    servicePrefix                       = "kube_service_"
...

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L61

func (lbaas *LbaasV2) GetLoadBalancerName(_ context.Context, clusterName string, service *corev1.Service) string {
    name := fmt.Sprintf("%s%s_%s_%s", servicePrefix, clusterName, service.Namespace, service.Name)
    return cpoutil.CutString255(name)
}

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L586-L589

SecurityGroup

func getSecurityGroupName(service *corev1.Service) string {
    securityGroupName := fmt.Sprintf("lb-sg-%s-%s-%s", service.UID, service.Namespace, service.Name)
    //OpenStack requires that the name of a security group is shorter than 255 bytes.
    if len(securityGroupName) > 255 {
        securityGroupName = securityGroupName[:255]
    }

    return securityGroupName
}

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L415-L423

Floating IP

            floatIPOpts := floatingips.CreateOpts{
                FloatingNetworkID: svcConf.lbPublicNetworkID,
                PortID:            portID,
                Description:       fmt.Sprintf("Floating IP for Kubernetes external service %s from cluster %s", serviceName, clusterName),
            }

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L978-L982

fdobrovolny commented 10 months ago

Summary

I have not been able to find where PVCs get their names.
I have created a PR for updating our terraform.
I have identified that the names cannot be updated without changes to the core software.
The name generations in all cases rely on constants, so it should not be hard to make those constants configurable and likely to make them into upstream change.
The floating IP description could be changed and also be made configurable.
Adding tags to floating IP seems realistic as well.

chess-knight commented 10 months ago

I have not been able to find where PVCs get their names.

In the past, I created an issue yaook/k8s#465. It is not about PV names(the user specifies PVC names, and PV names look random(probably some request ID)) but their description and tags.

Each volume created by cinder CSI has description Created by OpenStack Cinder CSI driver. Also, the tag cinder.csi.openstack.org/cluster=<cluster_name> is added there. We already set this value here. So in the volume metadata we can see property like cinder.csi.openstack.org/cluster='testcluster'. The same happens with volume snapshots.

jschoone commented 10 months ago

Let's split this in multiple tasks

chess-knight commented 6 months ago

After the ClusterClass feature #600, control-plane resources also lack $PREFIX. In this issue, it was marked as OK, but now it should be:

VMs: $CLUSTER_NAME-control-plane (Not good: Lacks $PREFIX)
Ports: $CLUSTER_NAME-control-plane-* (Not good: Lacks $PREFIX)
Root Volume: ...

Because of this change, we had to also modify our full cleanup script, from grep $PREFIX-$CLUSTER_NAME to grep "$PREFIX-$CLUSTER_NAME\|$CLUSTER_NAME-control-plane" which is IMO not good.

Also, in the PR https://github.com/SovereignCloudStack/k8s-cluster-api-provider/pull/661#issuecomment-1842931890, the naming of control-plane resources will be reduced even more.

The naming of ClusterClass resources can be seen in the CAPI issue https://github.com/kubernetes-sigs/cluster-api/issues/9135 - we can also see, that machine deployment resources have no problem so far because, in the naming scheme, there is also {topology-name} which is in our case is "${PREFIX}-${CLUSTER_NAME}-md-0-no1".

The only way as far as I can see for now is to add a prefix to control-plane resources by changing the cluster name itself by adding of prefix there.

jschoone commented 2 weeks ago

I'll close this because it's related to KaaS v1 which will be deprecated after R7. Feel free to reopen, if I'm wrong

jschoone commented 3 days ago

I'll close this because it's related to KaaS v1 which will be deprecated after R7. Feel free to reopen, if I'm wrong

Just realized, I was wrong. We have the problem, that the name for the ingress-nginx loadbalancer is always the same and then it errors inside of openstack when there are more then one, related to this comment. @michal-gubricky please find out how to configure clusterName using Cluster API to avoid changing code upstream.

jschoone commented 2 days ago

Seems like this value will be enough to set.

michal-gubricky commented 2 days ago

Seems like this value will be enough to set.

I will take a look at it and test it.

michal-gubricky commented 2 days ago

Seems like this value will be enough to set.

I will take a look at it and test it.

After investigation, setting this variable should do the job. The cluster name is set to kubernetes by default via an environment variable. This can be seen, for example, by describing a pod of OCCM.

    Environment:
      CLOUD_CONFIG:  /etc/config/cloud.conf
      CLUSTER_NAME:  kubernetes

and can also be found in values.yaml

SovereignCloudStack / k8s-cluster-api-provider

Prefix all resources with PREFIX and CLUSTER_NAME #495

MgmtServer:

Cluster

Ingress

Persistent Volumes

Networking

Router

Subnet

Floating IP description

Floating IP Tags

Loadbalancer

Security Groups

Ingress

Load balancer

SecurityGroup

Floating IP

Summary