argoproj / argo-rollouts

Progressive Delivery for Kubernetes
https://argo-rollouts.readthedocs.io/
Apache License 2.0

Job's label value exceeds max value of 63 characters. #795

Open TommyLike opened 4 years ago

TommyLike commented 4 years ago

Summary

Assume we have created an analysis template as below:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: service-endpoint-reachable
spec:
  args:
  - name: service-fqdn
  - name: status-code
  metrics:
  - name: service-endpoint-reachable
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              restartPolicy: Never
              containers:
                - name: curl-service-endpoint
                  image: curlimages/curl:7.73.0
                  command:
                    - /bin/sh
                    - -c
                    - |
                      curl -I {{args.service-fqdn}}
                      status_code=$(curl --write-out "%{http_code}\n" --silent --output /dev/null "{{args.service-fqdn}}");
                      [ $status_code -eq {{args.status-code}} ]

Argo Rollouts will fail to create the desired Job because the generated label value exceeds the Kubernetes limit of 63 characters for label values:

 message: 'Job.batch "990a01d4-ff15-402b-8516-0914c420e2b2.service-endpoint-reachable.5"
        is invalid: spec.template.labels: Invalid value: "990a01d4-ff15-402b-8516-0914c420e2b2.service-endpoint-reachable.5":
        must be no more than 63 characters'
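The arithmetic behind this error can be checked directly. The sketch below assumes, based only on the error message above, that the generated value has the shape `<UUID>.<metric name>.<index>`; the helper `generated_name` is hypothetical and is not part of Argo Rollouts.

```python
# Assumption: the generated Job name / label value looks like
# "<UUID>.<metric name>.<index>", inferred from the error message only.
MAX_LABEL_VALUE_LEN = 63  # Kubernetes limit for label values


def generated_name(uid: str, metric_name: str, index: int) -> str:
    """Hypothetical helper that mimics the naming pattern seen in the error."""
    return f"{uid}.{metric_name}.{index}"


name = generated_name(
    "990a01d4-ff15-402b-8516-0914c420e2b2",  # 36-character UUID
    "service-endpoint-reachable",            # 26-character metric name
    5,
)
print(len(name), len(name) > MAX_LABEL_VALUE_LEN)  # prints: 65 True
```

With a 36-character UUID and two separators, a 26-character metric name already yields 65 characters, two over the limit.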

I guess this is a known issue, but it is very likely to occur because of how the name is generated.

Diagnostics

What version of Argo Rollouts are you running?

# Paste the logs from the rollout controller

# Logs for the entire controller:
kubectl logs -n argo-rollouts deployment/argo-rollouts

# Logs for a specific rollout:
kubectl logs -n argo-rollouts deployment/argo-rollouts | grep rollout=<ROLLOUTNAME>

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

davidxia commented 3 years ago

I'm running into this issue as well. As a user of argo-rollouts, I'd like creating a [Cluster]AnalysisTemplate whose metric names are too long to fail at creation time instead of failing later at AnalysisRun run time. Perhaps the code @TommyLike referenced above can check the Job name length and hash or truncate the provided name in some way?
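A creation-time check along these lines could look like the following sketch. It assumes the same `<UUID>.<metric name>.<index>` pattern seen in the error message; `max_metric_name_len` and `validate_metric_name` are hypothetical names, not existing Argo Rollouts functions.

```python
MAX_LABEL_VALUE_LEN = 63  # Kubernetes limit for label values
UUID_LEN = 36             # length of a canonical textual UUID
SEPARATORS = 2            # the two "." in "<UUID>.<metric name>.<index>" (assumed format)


def max_metric_name_len(index_digits: int = 1) -> int:
    """Characters left for the metric name once the fixed parts are counted."""
    return MAX_LABEL_VALUE_LEN - UUID_LEN - SEPARATORS - index_digits


def validate_metric_name(metric_name: str, index_digits: int = 1) -> None:
    """Raise ValueError early instead of failing later when the Job is created."""
    limit = max_metric_name_len(index_digits)
    if len(metric_name) > limit:
        raise ValueError(
            f"metric name {metric_name!r} is {len(metric_name)} characters; "
            f"at most {limit} fit in a 63-character label value"
        )


print(max_metric_name_len(1), max_metric_name_len(2))  # prints: 24 23
```

Under these assumptions the budget shrinks by one character each time the index gains a digit, so a validator would need to pick a conservative `index_digits`.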

https://github.com/argoproj/argo-rollouts/blob/96ba0302f4d904e9aba36c3341025f4f51eaed81/metricproviders/job/job.go#L57-L63

jstewart612 commented 3 years ago

As a workaround, I have truncated all my job names to 20 characters or fewer, because that is how many characters over the limit I was in my case. I don't know whether the length of the value Argo Rollouts generates here varies, but I haven't hit the issue since. I assume it varies only when the number of jobs in the AnalysisTemplate reaches double digits or more, since I would expect the UUID that seems to be prepended to this label to have a fixed length.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity.

SleepyBrett commented 4 months ago

It is unacceptable for this issue to be stale. It still persists. If you are using labels to find the child job, you should break this value up and use multiple labels.
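One way to read the multiple-labels suggestion is sketched below. The label keys under `example.com/` are placeholders invented for illustration, and `split_labels` is a hypothetical helper; only the 63-character limit and the label value syntax come from Kubernetes.

```python
import re
from typing import Dict

# Kubernetes label value syntax: empty, or alphanumeric at both ends with
# "-", "_" and "." allowed in between; at most 63 characters.
LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    return len(value) <= 63 and LABEL_VALUE_RE.fullmatch(value) is not None


def split_labels(uid: str, metric_name: str, index: int) -> Dict[str, str]:
    """Put each component in its own label instead of one concatenated value."""
    labels = {
        "example.com/analysis-run-uid": uid,          # placeholder key
        "example.com/metric-name": metric_name,       # placeholder key
        "example.com/measurement-index": str(index),  # placeholder key
    }
    for key, value in labels.items():
        if not is_valid_label_value(value):
            raise ValueError(f"label {key} has an invalid value: {value!r}")
    return labels


labels = split_labels(
    "990a01d4-ff15-402b-8516-0914c420e2b2", "service-endpoint-reachable", 5
)
print(labels)
```

A label selector can then match on all three keys at once. A metric name longer than 63 characters would still be rejected on its own, so this raises the ceiling rather than removing it.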

kostis-codefresh commented 4 months ago

What would be the recommended solution here? Simply truncating the name seems to be the quickest workaround, but then we run the risk of different jobs being truncated to the same name. Or is this not a possible/interesting use case?
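A common way to reduce the collision risk of plain truncation is to append a short hash of the full name. The following is a minimal sketch of that idea, not a proposal taken from the Argo Rollouts code base; `shorten` and the 8-character hash length are assumptions made for illustration.

```python
import hashlib

MAX_LABEL_VALUE_LEN = 63  # Kubernetes limit for label values
HASH_LEN = 8              # hex characters kept from the digest (arbitrary choice)


def shorten(name: str, max_len: int = MAX_LABEL_VALUE_LEN) -> str:
    """Return name unchanged if it fits, else a truncated prefix plus a hash suffix."""
    if len(name) <= max_len:
        return name
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:HASH_LEN]
    # Leave room for "-" and the hash; strip trailing separators for readability.
    prefix = name[: max_len - HASH_LEN - 1].rstrip("-_.")
    return f"{prefix}-{digest}"


a = shorten("990a01d4-ff15-402b-8516-0914c420e2b2.service-endpoint-reachable.5")
b = shorten("990a01d4-ff15-402b-8516-0914c420e2b2.service-endpoint-reachable.6")
print(a)
print(b)
```

The two inputs share the same first 54 characters, so plain truncation would map them to the same value; the hash suffix keeps them distinct. A short hash makes collisions very unlikely rather than impossible, and the shortened value can no longer be parsed back into its components.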

Qasim-Aziz commented 4 months ago

facing same issue