Malformed provider ID crashes cloud-controller-manager

chewong commented 1 year ago

What happened:

E0416 22:04:28.488055      11 runtime.go:79] Observed a panic: &errors.errorString{s:"unable to calculate an index entry for key \"<node-name>\" on index \"instanceID\": error mapping node \"<node-name>\"'s provider ID \"foo://com-2351\" to instance ID: Invalid format for AWS instance (foo://com-2351)"} (unable to calculate an index entry for key "<node-name>" on index "instanceID": error mapping node "<node-name>"'s provider ID "foo://com-2351" to instance ID: Invalid format for AWS instance (foo://com-2351))

Users can self-DOS if they manually edit the node's provider ID to a malformed string. I am wondering if this is by design or we should ignore this error.

https://github.com/kubernetes/cloud-provider-aws/blob/master/pkg/providers/v1/aws.go#L853

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

/kind bug

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

kmala commented 1 year ago

The provider ID is updated by the k8s based on the instance ID returned by the CCM. So, the better way i think would be to skip the instance instead of failing. We can ignore instance with bad provider id similar to instances with no provider id https://github.com/kubernetes/cloud-provider-aws/blob/master/pkg/providers/v1/aws.go#L853

mmerkes commented 1 year ago

Is there any downside if we skip the instance? In the case where the provider ID is not set, the assumption is that eventually it will be. Obviously, we don't want to crash CCM is this is wrong, and I don't know if there's anything to do but ignore it.

tzneal commented 1 year ago

/cc

kubernetes / cloud-provider-aws

Malformed provider ID crashes cloud-controller-manager #604