argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

reduce pod definition size #13089

Open tooptoop4 opened 1 month ago

tooptoop4 commented 1 month ago

Summary

The pod spec for steps run by Argo is quite large and can fill up etcd. To trim it down, I imagine there will be 2 parts:

  1. code changes
  2. docs for configuration of init vs main vs wait containers

thoughts/Qs:

  1. is ARGO_TEMPLATE env variable needed on all 3 containers?
  2. are volumeMounts needed on both main and wait containers?
  3. do all 3 containers need environment variables for communicating with s3, i.e. via artifactRepository? (I'm guessing the main container doesn't)
  4. ARGO_TEMPLATE is huge. Does the ARGO_TEMPLATE env variable contain things it doesn't need? Perhaps some containers need things in it that other containers don't?
  5. are ARGO_PROGRESS_PATCH_TICK_DURATION/ARGO_PROGRESS_FILE_TICK_DURATION/ARGO_INCLUDE_SCRIPT_OUTPUT/ARGO_PROGRESS_FILE env variables needed?
  6. do the commands on all the containers need --loglevel/--log-format?
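
One way to put numbers on the questions above is to measure how many bytes each environment variable contributes per container. The sketch below is only an illustration: the helper names `env_var_sizes` and `manifest_size` are made up for this example, the manifest is a toy one rather than real Argo output, and compact JSON length is only a rough proxy for what etcd actually stores. For a real pod, the dict could be loaded from the output of `kubectl get pod <name> -o json`.

```python
import json


def env_var_sizes(pod):
    """Map each container name to {env var name: byte size of its literal value}."""
    spec = pod.get("spec", {})
    containers = spec.get("initContainers", []) + spec.get("containers", [])
    sizes = {}
    for container in containers:
        sizes[container["name"]] = {
            # Entries that use valueFrom have no literal value and count as 0 bytes.
            env["name"]: len((env.get("value") or "").encode("utf-8"))
            for env in container.get("env", [])
        }
    return sizes


def manifest_size(pod):
    """Byte size of the manifest serialized as compact JSON."""
    return len(json.dumps(pod, separators=(",", ":")).encode("utf-8"))


# Toy manifest (an assumption for illustration): the same template on all 3 containers.
template = json.dumps({"name": "step", "container": {"image": "alpine", "args": ["echo", "hi"]}})
env = [{"name": "ARGO_TEMPLATE", "value": template}]
pod = {
    "spec": {
        "initContainers": [{"name": "init", "env": env}],
        "containers": [{"name": "wait", "env": env}, {"name": "main", "env": env}],
    }
}

sizes = env_var_sizes(pod)
template_bytes = sum(per_container.get("ARGO_TEMPLATE", 0) for per_container in sizes.values())
print(sizes)
print(f"ARGO_TEMPLATE: {template_bytes} of {manifest_size(pod)} bytes")
```

Running this against a few real pods would show whether ARGO_TEMPLATE dominates the spec or whether the smaller variables and command flags matter as well.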

Use Cases

ensure etcd does not fill up

jswxstw commented 1 month ago
> is ARGO_TEMPLATE env variable needed on all 3 containers?

ARGO_TEMPLATE may be huge. #12325 provides an optimization that offloads EnvVarTemplate; perhaps it could be made the default behavior. However, its lifecycle is tied to the workflow, not the pod, so should it be deleted on pod GC?
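
To make the offload idea concrete, here is a minimal sketch of the general pattern: keep small templates inline, store large ones once outside the pod spec, and pass only a reference. It does not describe how #12325 is implemented. The threshold, the `offloaded:` prefix, the in-memory `store`, and the function names are all hypothetical and exist only for this example.

```python
import json

MAX_INLINE_BYTES = 256  # hypothetical threshold, chosen only for this example


def template_env(template, store, key, max_inline=MAX_INLINE_BYTES):
    """Return an env var entry that carries the template inline or by reference."""
    raw = json.dumps(template, separators=(",", ":"))
    if len(raw.encode("utf-8")) <= max_inline:
        return {"name": "ARGO_TEMPLATE", "value": raw}
    # Too large: keep one copy in the external store and reference it by key.
    store[key] = raw
    return {"name": "ARGO_TEMPLATE", "value": "offloaded:" + key}


def on_pod_gc(store, key):
    """Drop the offloaded copy when its pod is garbage collected."""
    store.pop(key, None)


store = {}  # stands in for whatever external storage holds offloaded templates
small = template_env({"name": "a"}, store, "pod-a")
large = template_env({"name": "b", "script": "x" * 1000}, store, "pod-b")
print(small["value"])
print(large["value"], len(store["pod-b"]))
```

The `on_pod_gc` hook corresponds to the lifecycle question raised here: without it, the stored copy lives on after the pod is gone.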

> are ARGO_PROGRESS_PATCH_TICK_DURATION/ARGO_PROGRESS_FILE_TICK_DURATION/ARGO_INCLUDE_SCRIPT_OUTPUT/ARGO_PROGRESS_FILE env variables needed?

ARGO_PROGRESS_PATCH_TICK_DURATION/ARGO_PROGRESS_FILE_TICK_DURATION/ARGO_PROGRESS_FILE are used to implement self-reporting of progress. ARGO_INCLUDE_SCRIPT_OUTPUT is used to determine whether stdout needs to be saved.

I think optimizing the reuse of ARGO_TEMPLATE would be sufficient. The other aspects have minimal impact, so there's no need to be overly strict about them.

tooptoop4 commented 1 month ago

@jswxstw do u know what ARGO_TEMPLATE is for? and do all 3 containers need it?

jswxstw commented 1 month ago

> @jswxstw do u know what ARGO_TEMPLATE is for? and do all 3 containers need it?

All 3 containers need ARGO_TEMPLATE to prepare inputs or save outputs, and in some types of templates, like Script/Resource/ContainerSet, it serves other purposes as well.