argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Max pod name length too long #11356

Open matt-carr opened 1 year ago

matt-carr commented 1 year ago

What happened/what you expected to happen?

Observed behaviour: When running a workflow with nested templates, if the template names are long enough, the generated pod names exceed the conventional 63-character limit.

Expected behaviour: The pod names should be truncated or otherwise limited, similar to names generated by replicasets, etc.

The Kubernetes documentation is a little unclear on this case. https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-subdomain-names says that object names should be valid DNS subdomain names, which limits them to 253 characters (and this is the limit you use in your code, as far as I can tell from https://github.com/argoproj/argo-workflows/issues/7896). However, the RFC it links to indicates that this is the limit for the fully qualified name, and section 6.1.3.5 (Extensibility) says:

The DNS defines domain name syntax very generally -- a string of labels each containing up to 63 8-bit octets, separated by dots, and with a maximum total of 255

This indicates that object names should, in fact, max out at 63 characters, since each one is an individual label. Kubernetes enforcing a 63-character limit on pod names by default seems to confirm that this is the intended behaviour: trying to create a pod with a name longer than 63 characters fails with

* spec.containers[0].name: Invalid value: "foobar-work-flow-8zdl7-a-very-long-template-name-example-3607763690": must be no more than 63 characters

and the names of pods created by any pod sets (ReplicaSets, etc.) are truncated appropriately.
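To make the two limits concrete, here is a minimal Go sketch of the DNS label check versus the DNS subdomain check, applied to the pod name from this report. The helper names (isDNSLabel, isDNSSubdomain) are hypothetical and the regular expression is an approximation of the RFC 1123 rules, not the validation code Kubernetes or Argo actually runs. The sketch assumes the subdomain check only caps the total length at 253 and does not cap each label at 63, which is the discrepancy described above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

const (
	maxLabelLen     = 63  // limit for a single DNS label
	maxSubdomainLen = 253 // limit for a whole DNS subdomain name
)

// labelRe approximates the character rules for one lowercase RFC 1123 label:
// alphanumerics and '-', starting and ending with an alphanumeric.
var labelRe = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

// isDNSLabel (hypothetical helper) reports whether name is a single valid
// label of at most 63 characters.
func isDNSLabel(name string) bool {
	return len(name) <= maxLabelLen && labelRe.MatchString(name)
}

// isDNSSubdomain (hypothetical helper) reports whether name is a
// dot-separated sequence of well-formed labels with a total length of at
// most 253 characters. Assumption: no per-label length cap is applied here.
func isDNSSubdomain(name string) bool {
	if len(name) > maxSubdomainLen {
		return false
	}
	for _, label := range strings.Split(name, ".") {
		if !labelRe.MatchString(label) {
			return false
		}
	}
	return true
}

func main() {
	podName := "foobar-work-flow-8zdl7-a-very-long-template-name-example-3607763690"
	// Prints the length, then the label and subdomain verdicts.
	fmt.Println(len(podName), isDNSLabel(podName), isDNSSubdomain(podName))
}
```

Under these assumptions the generated name is 67 characters long, so it passes the 253-character subdomain check but fails the 63-character label check.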

I'm not particularly fluent in Go, but I think this would be simple enough to tackle on my own, if this is an acceptable issue.

Version

v3.4.4

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: foobar-work-flow
spec:
  entrypoint: main-workflow
  templates:
  - name: main-workflow
    steps:
      - - name: a-very-long-template-name-example
          template: a-very-long-template-name-example
  - name: a-very-long-template-name-example
    container:
        image: docker/whalesay
        command: [ cowsay ]
        args: [ "hello world" ]
        resources:
          limits:
            memory: 32Mi
            cpu: 100m

Logs from the workflow controller

(not relevant, this is the generated pod name)
foobar-work-flow-8zdl7-a-very-long-template-name-example-3607763690

Logs from your workflow's wait container

n/a

terrytangyuan commented 1 year ago

@isubasinghe @JPZ13 Could you help take a look?

isubasinghe commented 1 year ago

@terrytangyuan I am on holidays atm, I can get a review done tomorrow, a bit busy tonight.

JPZ13 commented 1 year ago

I can review as well @terrytangyuan

isubasinghe commented 1 year ago

I had another look at this and I was wrong: this is certainly still an issue. The names we generate are too long. This is somewhat worrying to me.

We do generate a lot of names that are delimited by a '-'. I wonder how many things would break if we changed this to a '.'? That would give us more name length to work with.

We should do this properly (at the source of new name generation, which likely means a lot of source code changes) instead of in the GeneratePodName function.

If anyone is reliant on the current structure of name generation (separated via dashes), this would mean a breaking change.

A bit lost on what to do, any opinions @terrytangyuan?

terrytangyuan commented 1 year ago

It's likely that users who build inhouse UIs or services rely on the names so I am a bit concerned about changing "-" to "."

This issue happens when there are nested templates. I wonder if we could generate shorter aliased pod names instead of concatenating template names.

isubasinghe commented 1 year ago

@terrytangyuan that is a fair concern. I think it may be possible to generate shorter aliased pod names that way.

mikutas commented 1 year ago

Since Kubernetes 1.27, the workflow-controller receives a warning from kube-apiserver when workload names are not DNS labels.

metadata.name: this is used in the Pod's hostname, which can result in surprising behavior; a DNS label is recommended: [must be no more than 63 characters]

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#api-change-3
https://github.com/kubernetes/kubernetes/pull/114412

terrytangyuan commented 1 year ago

Could you paste the warning message?

terrytangyuan commented 1 year ago

Found the specific changelog item:

Added warnings about workload resources (Pods, ReplicaSets, Deployments, Jobs, CronJobs, or ReplicationControllers) whose names are not valid DNS labels. (https://github.com/kubernetes/kubernetes/pull/114412, @thockin)

Garett-MacGowan commented 8 months ago

+1 on this. I just ran into this issue.

rcontreras-te commented 2 months ago

We just ran into this. Is this actually being worked on? Our current workaround is to simply rework the workflow so as to use shorter names when using nested templates. Are there any other suggestions?

tooptoop4 commented 1 week ago

When you say "issue", what breaks? It seems like a harmless warning log.