kubernetes-sigs / cluster-api-provider-aws

Kubernetes Cluster API Provider AWS provides consistent deployment and day 2 operations of "self-managed" and EKS Kubernetes clusters on AWS.
http://cluster-api-aws.sigs.k8s.io/
Apache License 2.0
632 stars 552 forks source link

Controller can generate an ELB name that is too long #889

Closed ajwdev closed 4 years ago

ajwdev commented 5 years ago

/kind bug

What steps did you take and what happened:

I created a cluster with the name "this-is-too-long-right-1". CAPA provisioned much of the infrastructure (VPC, subnets, bastion host, etc) but fails creating an ELB with the following error:

I0710 20:36:00.861066       1 loadbalancer.go:36] [cluster-actuator]/cluster.k8s.io/v1alpha1/75a9e9e03c0751d8cc3612383c854fcbeb78af8d/this-is-too-long-right-1 "level"=2 "msg"="Reconciling load balancers"
E0710 20:36:00.892528       1 cluster_controller.go:146] Error reconciling cluster object this-is-too-long-right-1; failed to reconcile load balancers for cluster "this-is-too-long-right-1": unexpected aws error: ValidationError: LoadBalancer name cannot be longer than 32 characters

I believe this is because it attempts to create an ELB called "this-is-too-long-right-1-apiserver".

What did you expect to happen:

My expectation was that if the cluster name was legal according the Kubernetes API that it should be valid for all provisioning tasks. At minimum, I would expect a OpenAPI validation to reject my cluster object name.

Environment:

rudoi commented 5 years ago

The Cluster object can map to a multitude of underlying providers (AWS, Azure, etc), each of which probably has differing limitations on load balancer name length. For example, it's 32 in AWS but 80 in Azure. I don't know if it would be appropriate to have the schema limit the length to the lowest limit found in the providers.

Maybe we need some ValidatingWebhook action here to provide the most fluid UX? 🤔

ncdc commented 5 years ago

It might be better to have code here in CAPA that does something like a consistent hash of the cluster name that is <= 32 characters - that way we wouldn't introduce a provider-specific limitation on cluster name.

detiber commented 5 years ago

No objection to using a hash for resource naming, as long as we make sure that we still use the full name for tags and descriptions.

rudoi commented 5 years ago

Is there any way we can meet the length requirements while maintaining a user-friendly name? I don't think anyone is super thrilled with the hashed names that currently exist in the Kubernetes cloud providers.

ncdc commented 5 years ago

If the cluster's name is > 32 chars, we could use something like the first 24 + an 8-char hash of the cluster's name or possibly just random data (replace 24 & 8 with reasonable values to avoid collisions, but you get the idea).

rudoi commented 5 years ago

If the cluster's name is > 32 chars, we could use something like the first 24 + an 8-char hash of the cluster's name or possibly just random data (replace 24 & 8 with reasonable values to avoid collisions, but you get the idea).

Love it, ship it.

dlipovetsky commented 5 years ago

Another constraint for ELB names besides length and uniqueness within region: A CAPI cluster name must be a valid Kubernetes object name. But there might be characters that are valid in a Kubernetes object name, but invalid in some external API's name.

Valid Kubernetes names. (The following is an excerpt of the validation error returned by the apiserver):

DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character

And valid classic ELB names:

... must have a maximum of 32 characters, must contain only alphanumeric characters or hyphens, and cannot begin or end with a hyphen.

So, a cluster name can have the ( .) character, but that character cannot be used in the ELB name.

dlipovetsky commented 5 years ago

It might be better to have code here in CAPA that does something like a consistent hash of the cluster name that is <= 32 characters - that way we wouldn't introduce a provider-specific limitation on cluster name.

The provider implementation might need to translate names to identify external resources. And an end user might need to translate names to do the same thing, perhaps to do something "out of band" with those resources. I can't think of a reason that CAPA would need to translate names. So it doesn't seem like a problem that CAPA needs to solve.

Therefore, my intuition is to leave the translation to each provider.

However, it would be nice for the end user to get the translation from the provider. (Maybe nicest of all would be for the end user to get an enumeration of all the infrastructure resources created by the provider)

detiber commented 5 years ago

+1 on handling name mangling in the providers, since it allows each provider to handle their own particular needs without forcing all of those requirements into core cluster-api.

ncdc commented 4 years ago

/assign /lifecycle active /help cancel

ncdc commented 4 years ago

A few things:

If we make a change to the ELB naming convention, clusters created prior to the change will end up creating a 2nd ELB using the new naming convention, unless we come up with an upgrade strategy.

The current convention appends -apiserver, which takes up 10 of our 32 characters. Do we want to preserve this?

Thoughts?

detiber commented 4 years ago

I'm good with removing the -apiserver suffix, for a migration strategy, we could keep a fallback lookup based on the old naming convention.

ncdc commented 4 years ago

+1 to fallback

ncdc commented 4 years ago

I'm worried about the naming convention & possible collisions if we try to combine the cluster name with a truncated hash.

What if we did this? Generate a UUID, have the AWSCluster controller store it as awscluster.spec.elbName, and use that value for the ELB name. We would continue to use tags to associate the ELB with a cluster. I know the names won't be pretty, but they'd be unique.

chuckha commented 4 years ago

That sounds good. I'd want unique over nice names. Plus we can add the Name=nice-name tag and get a better looking AWS Console.

detiber commented 4 years ago

Doesn't the elb name affect the dns name provided? If so, it may be worth generating something nice'ish rather than a pure UUID.

ncdc commented 4 years ago

It does, but I doubt we can generate truly unique names unless it's a UUID. We could use the first 16 characters of the cluster name (the "nice" half) followed by 16 characters that would be the md5 hash of the full cluster name, truncated to the first 64 bits of the hash. I imagine the risk of collision is low, but it's still possible.

ncdc commented 4 years ago

/unassign /remove-lifecycle active