argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Allow to specify a PVC as log temporary folder #12661

Open e-tip opened 9 months ago

e-tip commented 9 months ago

Summary

I'd like a way to specify a PVC to be mounted into the wait container, so that log output is written to this PVC instead of ephemeral storage. Unfortunately, uploading logs to S3 is quite hard if I have to save all the output of a workflow in a single file. I've used Fluent Bit with the S3 plugin, but it requires a variable part in the file key to avoid overwriting existing data. I know the documentation clearly advises against relying on Argo Workflows for log archiving, but if one only needs to archive the logs on S3 rather than process them, it's still the best solution.

Use Cases

When an application logs a lot of data and the archiveLogs option is true, Argo Workflows relies on the ephemeral storage of the wait container to save a temporary file, which can fill up the disk if the worker node has little disk space. To prevent this, we can set a resource limit on the wait container, but this makes the pod fail if the log grows larger than the limit. To be able to use the archiveLogs option without worrying too much about disk space, setting the temporary location to a PVC could be a good idea.


Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritize the proposals with the most 👍.

agilgur5 commented 9 months ago

For reference, Slack thread about this.

You can use podSpecPatch to modify the wait container and add a volume
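
A minimal sketch of what such a patch might look like, under stated assumptions: the claim name `my-existing-pvc` and the volume name `log-tmp` are hypothetical, and the mount path `/tmp/argo/outputs/logs` is an assumption about where the executor stages log files before upload, so it should be checked against the Argo Workflows version in use.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: logs-on-pvc-
spec:
  entrypoint: main
  archiveLogs: true
  # podSpecPatch is merged into the generated pod spec.
  podSpecPatch: |
    volumes:
      - name: log-tmp                     # hypothetical volume name
        persistentVolumeClaim:
          claimName: my-existing-pvc      # hypothetical, must already exist
    containers:
      - name: wait                        # the Argo sidecar that saves logs
        volumeMounts:
          - name: log-tmp
            mountPath: /tmp/argo/outputs/logs   # assumed temp log directory
  templates:
    - name: main
      container:
        image: busybox
        command: [sh, -c]
        args: ["echo hello"]
```

Declaring the volume inside the patch itself, rather than under `spec.volumes`, avoids depending on whether the controller attaches volumes that the main container does not mount.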

e-tip commented 9 months ago

I tried, but this way I can't use volumeClaimTemplates and have to use an existing PVC, right? And in the case of a DAG workflow with withItems, the pods would overwrite each other's logs.
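
One possible way to keep pods from overwriting each other on a shared PVC is to give each pod its own subdirectory using the Kubernetes `subPathExpr` field. The fragment below is only a sketch: the claim name `my-existing-pvc`, the volume name `log-tmp`, and the environment variable `LOG_POD_NAME` are hypothetical, and the mount path is an assumption about where the wait container stages logs.

```yaml
# Fragment of a Workflow spec; only podSpecPatch is shown.
podSpecPatch: |
  volumes:
    - name: log-tmp                     # hypothetical volume name
      persistentVolumeClaim:
        claimName: my-existing-pvc      # hypothetical existing PVC
  containers:
    - name: wait
      env:
        - name: LOG_POD_NAME            # hypothetical variable, set from the pod name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      volumeMounts:
        - name: log-tmp
          mountPath: /tmp/argo/outputs/logs   # assumed temp log directory
          subPathExpr: $(LOG_POD_NAME)        # one subdirectory per pod
```

If the withItems pods can be scheduled onto different nodes, the PVC would also need an access mode that allows this, such as ReadWriteMany.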