Closed joannelynch92 closed 4 years ago
Thanks for the report! Looks like ParseNormalizedNamed is returning an error, which we're ignoring. I'll have to dig a bit to see why we're ignoring the error and what exactly it's returning. I'll try to reproduce the problem and then fix it.
@joannelynch92 I created a pod in my own cluster using the jaegertracing/all-in-one:latest image, and was able to run clusterlint successfully:
% clusterlint run -c latest-tag
[warning] default/pod/jaeger: Avoid using latest tag for container 'jaeger'
This makes me think that pod/container is not actually the one causing trouble in your case, but clearly some pod is causing us to crash. I've created PR #72 to add a check for the error we're ignoring. If you're able to build a clusterlint binary from that PR and run it on your cluster, I'd be interested to see the output of kubectl get -o yaml for any pod that gets the new "Image name for container '<name>' could not be parsed" warning.
If you're not able to build your own clusterlint from that branch, feel free to wait until we merge the PR and do a release, then give it a try.
As a bit of background: we were ignoring the error from reference.ParseNormalizedNamed because that's the same function k8s itself calls to parse image names when you deploy a workload, so we expect images in a running workload would always have a name the function can parse. It seems like there's something running in your cluster that has an image name that doesn't parse successfully; I'm very curious what that might be :-).
Someone fixed the problem pod on the cluster over the weekend, but I had saved the output of all the cluster's pods and noticed that one image was missing its tag.
$ kubectl get pod hello-release-hello-world-76bc67557d-4g565 -o yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - image: 'redacted.dkr.ecr.us-east-1.amazonaws.com/redacted/redacted:'
status:
  containerStatuses:
  - image: 'redacted.dkr.ecr.us-east-1.amazonaws.com/redacted/redacted:'
    imageID: ""
    lastState: {}
    name: hello-world
    ready: false
    restartCount: 0
    state:
      waiting:
        message: 'Failed to apply default image tag "redacted.dkr.ecr.us-east-1.amazonaws.com/redacted/redacted:":
          couldn''t parse image reference "redacted.dkr.ecr.us-east-1.amazonaws.com/redacted/redacted:":
          invalid reference format'
        reason: InvalidImageName
I put the broken image back in as a test and clusterlint crashed again (v0.1.3, without the changes from the PR), so it looks like that was the image name at fault. Thanks for your help!
@joannelynch92 Ah, thanks for the update - I am able to reproduce the problem with a missing tag. #72 fixes the issue.
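For anyone reading later, here is a rough sketch of the failure mode discussed above: an image reference that ends in a bare colon has an empty tag, so parsing fails, and a check that discards that error has nothing valid to work with. This is not the code from #72. The checkTag and lintImage helpers are hypothetical stand-ins written with the standard library only, and checkTag covers only the tag portion rather than the full reference grammar that reference.ParseNormalizedNamed validates. The warning strings are the ones quoted in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errEmptyTag is a hypothetical error used only in this sketch.
var errEmptyTag = errors.New("invalid reference format: empty tag")

// checkTag is a simplified stand-in for a real image reference parser.
// It inspects only the tag and does not validate the rest of the name.
func checkTag(image string) (string, error) {
	// Digest references (name@sha256:...) have no tag to inspect here.
	if strings.Contains(image, "@") {
		return "", nil
	}
	// The tag separator is a colon after the last slash; an earlier colon
	// belongs to a registry port such as registry:5000/app.
	rest := image[strings.LastIndex(image, "/")+1:]
	i := strings.LastIndex(rest, ":")
	if i < 0 {
		return "latest", nil // no tag given, so "latest" is implied
	}
	tag := rest[i+1:]
	if tag == "" {
		return "", errEmptyTag // e.g. "repo/name:" as in the pod above
	}
	return tag, nil
}

// lintImage returns a warning message, or "" when there is nothing to report.
// The parse error is checked instead of being discarded.
func lintImage(container, image string) string {
	tag, err := checkTag(image)
	if err != nil {
		return fmt.Sprintf("Image name for container '%s' could not be parsed", container)
	}
	if tag == "latest" {
		return fmt.Sprintf("Avoid using latest tag for container '%s'", container)
	}
	return ""
}

func main() {
	fmt.Println(lintImage("jaeger", "jaegertracing/all-in-one:latest"))
	fmt.Println(lintImage("hello-world", "redacted.dkr.ecr.us-east-1.amazonaws.com/redacted/redacted:"))
}
```

The point of the sketch is only the control flow: report the unparseable image as a warning and move on, rather than assuming every image in a running cluster has a valid name.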
Closing this since it was fixed by #72.
I got the following error:
It looks like clusterlint errored on a latest tag because
clusterlint run ignore-checks latest-tag
ran successfully. The problem looks like it occurs because of a pod on my cluster that refers to a latest tag:
See status.containerStatuses.containerID.image for where the problem is.