kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0
15.95k stars, 4.65k forks

Allow using IP-based naming with custom domain and CCM #14035

Open ValeriiVozniuk opened 2 years ago

ValeriiVozniuk commented 2 years ago

/kind feature

With the switch to CCM in newer kOps versions, the only available option is "resource-based naming", and "IP-based naming" is broken in kOps 1.23+ when a custom domain is used. We have tried "resource-based naming" with kOps 1.23 on our test cluster, and everyone literally hates the new node names based on the AWS instance ID. I assume it could be useful in small clusters running within a single region, as you don't need additional queries to find the AWS instance ID, but in our multi-region setup it is really bad. Before, you could easily see where an instance was running just by looking at the IP part of its name in monitoring; now you first need to somehow find out which region the instance is running in, and then try to find it there. Minutes of wasted time instead of seconds of thinking.

Could you please provide an option for CCM to name nodes based on IP, or fix custom-domain handling so that it works as it did before kOps 1.23?

PS: I saw that there are some hostnameOverride options for kubelet/kubeProxy, but I was not able to make them work at the cluster level, either with or without CCM. While the override is present in the spec, on the actual node the name is always the Amazon instance ID or the DNS name.

Reference link: https://cloud-provider-aws.sigs.k8s.io/prerequisites/
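To make the extra work concrete, here is a hypothetical sketch of the lookup that resource-based naming forces in a multi-region setup. The instance ID and the region list are placeholders, and the aws CLI must be installed and configured with credentials for the describe-instances loop to return anything.

```shell
# Placeholder node name: under CCM resource-based naming, the node name
# is the EC2 instance ID, which says nothing about where the node runs.
node_name="i-0123456789abcdef0"

# Search an example list of regions until the instance is found.
# (Requires aws CLI credentials; each miss is one wasted API call.)
for region in us-east-1 eu-west-1 ap-southeast-1; do
  az=$(aws ec2 describe-instances --region "$region" \
        --instance-ids "$node_name" \
        --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
        --output text 2>/dev/null) && break || continue
done

# Once the availability zone is known, the region is the zone minus its
# trailing letter, e.g.:
az="eu-west-1a"
region="${az%?}"
echo "$region"   # prints: eu-west-1
```

With an IP-based name such as ip-10-42-3-7.<cluster-domain>, none of this is needed; the address is visible in the name itself.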

olemarkus commented 2 years ago

IP-based naming without using CCM was resolved with https://github.com/kubernetes/kops/pull/14024

However, when you upgrade to CCM on a supported k8s version, resource-based naming will indeed be forced. Your arguments seem to be based around a cluster running in multiple regions, but I don't see how that is possible. Are you saying you have such a setup?

hostnameOverride won't work. You can't set this to a static value.

Adding an option to use IP-based naming could be a feature worth adding, though. Happy to review a PR.

rifelpet commented 2 years ago

It's also worth mentioning that kubectl plugins or aliases can easily surface this information that would otherwise take minutes to find.

https://kubernetes.io/docs/reference/kubectl/#custom-columns
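As a concrete (hypothetical) sketch of that approach, the helpers below assume kubectl is installed and pointed at the cluster; topology.kubernetes.io/zone is the standard well-known zone label, so no EC2 lookup is needed:

```shell
# List nodes with internal IP (-o wide) plus an extra ZONE column.
nodes_with_location() {
  kubectl get nodes -o wide -L topology.kubernetes.io/zone
}

# Custom-columns variant: just the instance-ID name and its zone.
# Dots inside the label key must be escaped in the column expression.
nodes_zone() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone'
}
```

Either helper could also be wired into an alias or a kubectl plugin, as suggested above.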

ValeriiVozniuk commented 2 years ago

Hi @olemarkus,

Thank you. For some reason I didn't get a notification about #14024, and I see that release 1.23.3 was created, so I will test it once the binaries become available :)

About your question: sorry, I described our setup not quite correctly (and incompletely). Of course, we have a cluster-per-region setup, but data from all clusters is gathered into a centralized Grafana setup, where we have multiple dashboards/alerts/etc. for monitoring our application deployments across all clusters. While part of that information is shown in a per-cluster view, many dashboards/alerts are set up across all clusters for dev/ops convenience. There are also alerts sent to Slack when something happens. With IP-based names, it is easy to immediately see where something is happening without looking for/adding extra things, even in a short Slack message. People who have been in the company for a long time can even recognize the instance group from the IP in the name :)

With resource-based naming, all this useful information needs to be exposed in different ways, overcomplicating things for us. As for the pull request, unfortunately my Go skills are still not enough to make a patch like this :)

@rifelpet Sorry, I should have written that this is not about kubectl stuff :)

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

ValeriiVozniuk commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

ValeriiVozniuk commented 1 year ago

/remove-lifecycle stale

ValeriiVozniuk commented 1 year ago

@olemarkus From what I can see, the new naming scheme with CCM is enforced here: https://github.com/kubernetes/kops/blob/master/upup/pkg/fi/cloudup/template_functions.go#L715 Is there a reason it is like that, or would just removing that condition be enough to achieve the goal here?

shapirus commented 1 year ago

Any update on this? It is still an issue, not sure if precisely the same: https://github.com/kubernetes/kops/issues/15891

The attribute that is used to retrieve the value for node name should ideally be user-configurable rather than calculated dynamically in an obscure and undocumented way.

hakman commented 1 year ago

@shapirus At the moment, no work is being done by the maintainers to support this feature. We are happy to review contributions on this topic. It shouldn't be a big change, but it will require some effort.

shapirus commented 1 year ago

Is the forced instanceID naming required by CCM? What was the rationale behind this change? Asking because there may be technical reasons behind it which I don't know about.

p.s. it's not really a feature. It's (in my particular case described in https://github.com/kubernetes/kops/issues/15891) reverting to original behavior.

hakman commented 1 year ago

> Is the forced instanceID naming required by CCM?

No.

> What was the rationale behind this change? Asking because there may be technical reasons behind it which I don't know about.

This was required to support IPv6. It also fixed the numerous reports of broken clusters caused by misconfigured DHCP options.

> p.s. it's not really a feature. It's (in my particular case described in https://github.com/kubernetes/kops/issues/15891) reverting to original behavior.

At the moment, the old behaviour is gone. I understand that some users don't like the new behaviour, but changing it back to the way it was in kOps 1.22 requires an effort.

xiaoh163 commented 1 year ago

Any update on this?

> At the moment, the old behaviour is gone. I understand that some users don't like the new behaviour, but changing it back to the way it was in kOps 1.22 requires an effort.

Is there an ETA to change it back to the way that was in the Kops 1.22? Thank you

hakman commented 1 year ago

> Is there an ETA to change it back to the way that was in the Kops 1.22?

@xiaoh163 There is no ETA, as there is no work being done to change it back.

xiaoh163 commented 1 year ago

Got it, thank you.

maxwheel commented 11 months ago

> Any update on this?
>
> At the moment, the old behaviour is gone. I understand that some users don't like the new behaviour, but changing it back to the way it was in kOps 1.22 requires an effort.
>
> Is there an ETA to change it back to the way that was in the Kops 1.22? Thank you

We are facing the same problem... Reverting to the IP-based node name is MUCH appreciated!

olemarkus commented 11 months ago

I suggest adding some stronger reasoning for why this should be reintroduced. I don't consider "metadata as part of the hostname" much of a valid use case. As mentioned above, a simple API call, either to AWS or to k8s, will reveal much richer metadata.

shapirus commented 11 months ago

Having them as hostnames at least provides a better user experience (in most cases; sometimes you actually do need the providerID value): easier visual recognition of the host in the output of kubectl <top|get> nodes, being able to ssh into a host name copy-pasted from that output without extra wrapper scripts, etc.

Besides, it was a change of behavior made for no immediately apparent reason, and avoiding such changes is generally considered good practice.
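The wrapper-script point can be illustrated with a hypothetical helper; it assumes kubectl access to the cluster, that the node's InternalIP is reachable, and an ec2-user login:

```shell
# With instance-ID node names, ssh needs the InternalIP looked up first.
ssh_node() {
  ip=$(kubectl get node "$1" \
        -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
  ssh "ec2-user@${ip}"
}

# With IP-based names there was nothing to wrap: a node name copy-pasted
# from kubectl output resolved directly, e.g.
#   ssh ec2-user@ip-10-42-3-7.<cluster-domain>
```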

xqzhang2015 commented 7 months ago

Hi @olemarkus, could I confirm whether there is any action or plan to support a switch for UseInstanceIDForNodeName or IP-based names?

Some challenges for us: in our cluster we only use IPv4 (single-stack), and have no requirement to enable dual-stack or IPv6. But the enforced nodeName change breaks our monitoring (Pod-Node 1:1 deployment and instance CPU monitoring), and the HPA/Prometheus adapters stopped working after the k8s cluster was upgraded to 1.24.

BTW, we also deploy multiple k8s clusters in different AWS regions/data centers. DNS-name-style node names are much more user-friendly for troubleshooting.

Currently, we don't have many resources to do the adaptation work for using the AWS instance ID as the nodeName.

Regards.

olemarkus commented 7 months ago

As far as I know, no one is working on this, no.