kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Inconsistencies between qualified names on AWS nodes #16349

Closed rifelpet closed 2 months ago

rifelpet commented 7 months ago

/kind bug
/kind failing-test

Our grid jobs for RHEL-based distros are failing a test that was recently unskipped for unrelated reasons (https://github.com/kubernetes/kops/pull/16176).

https://testgrid.k8s.io/kops-grid#kops-grid-cilium-amzn2-k28

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/e2e-kops-grid-cilium-amzn2-k28/1756260864102502400

[FAIL] [sig-network] Networking Granular Checks: Services [It] should function for service endpoints using hostNetwork 
   [FAILED] failed dialing endpoint, did not find expected responses... 
  Tries 46
  Command curl -g -q -s 'http://100.96.4.93:9080/dial?request=hostname&protocol=http&host=100.66.81.213&port=80&tries=1'
  retrieved map[i-03b17693021906ac2.eu-west-1.compute.internal:{} i-03fbc6f079db37ce7.eu-west-1.compute.internal:{} i-0c90e87e766b90952.eu-west-1.compute.internal:{} i-0fd0d694876a8befc.eu-west-1.compute.internal:{}]
  expected map[i-03b17693021906ac2:{} i-03fbc6f079db37ce7:{} i-0c90e87e766b90952:{} i-0fd0d694876a8befc:{}]

This test expects unqualified names but is receiving fully qualified names. The test's expected data comes from the kubernetes.io/hostname label on nodes (which is also the node name itself), and as the output above shows, that is the unqualified instance ID.

The test's actual data comes from running the hostname command on a hostNetwork pod.

A list of our distros and whether hostname returns a fully qualified name:

I think our best path forward would be to configure the RHEL-based distros to return the unqualified name for hostname. This would match the behavior of the other distros.

Alternatively, we could make all node names fully qualified, like i-03fbc6f079db37ce7.eu-west-1.compute.internal, but this feels more disruptive.

rifelpet commented 7 months ago

This relates to https://github.com/kubernetes/kubernetes/issues/121018. The e2e test logic could also be updated to handle either qualified or unqualified hostname outputs.
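To illustrate what handling either form could look like, here is a minimal sketch in Go. It is not the actual e2e framework code: `shortHostname` and `sameHosts` are hypothetical helpers made up for this example. It assumes the values being compared are DNS-style hostnames (not IP addresses) and that the first label alone identifies a node, which holds for the instance IDs in the failure output above.

```go
package main

import (
	"fmt"
	"strings"
)

// shortHostname is a hypothetical helper: it returns the first DNS label of
// name, so "i-abc.eu-west-1.compute.internal" and "i-abc" both become "i-abc".
// Assumption: name is a hostname, not an IP address.
func shortHostname(name string) string {
	short, _, _ := strings.Cut(name, ".")
	return short
}

// sameHosts is a hypothetical helper: it reports whether expected and actual
// name the same set of hosts once every entry is reduced to its short form.
func sameHosts(expected, actual []string) bool {
	want := make(map[string]struct{}, len(expected))
	for _, n := range expected {
		want[shortHostname(n)] = struct{}{}
	}
	got := make(map[string]struct{}, len(actual))
	for _, n := range actual {
		got[shortHostname(n)] = struct{}{}
	}
	if len(want) != len(got) {
		return false
	}
	for n := range want {
		if _, ok := got[n]; !ok {
			return false
		}
	}
	return true
}

func main() {
	// Node names as reported by the kubernetes.io/hostname label.
	expected := []string{"i-03b17693021906ac2", "i-03fbc6f079db37ce7"}
	// Output of hostname from hostNetwork pods on a RHEL-based distro.
	actual := []string{
		"i-03b17693021906ac2.eu-west-1.compute.internal",
		"i-03fbc6f079db37ce7.eu-west-1.compute.internal",
	}
	fmt.Println(sameHosts(expected, actual))
}
```

Normalizing both sides keeps the comparison symmetric, so it would also pass if node names became fully qualified while hostname returned the short form.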

/kind office-hours

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

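This bot triages issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage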

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 2 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes/kops/issues/16349#issuecomment-2221101998):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue with `/reopen`
>- Mark this issue as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close not-planned
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.