Open GuilleAmutio opened 2 months ago
Thanks for the request @GuilleAmutio! I was just wondering, what advantages would creating ad-hoc PVCs for each flow run have over maintaining one or more long-lived PVCs that can be used between flow runs?
Hey @desertaxle thank you for taking the time to read the feature request.
At the start I also thought about having one or more long-lived PVCs, but my use case might be quite special.
Our jobs always run in the /opt/prefect directory, and we created a custom image that runs a script, start.sh, on startup. That script does multiple things, like downloading files. So if another job is launched at the same time, it can corrupt the files downloaded by the first job, which is still running, because the second job runs the same startup script.
I am aware that this might be a very specific scenario, and I am thinking of a workaround while writing this, but it might require some scripting on my side.
Anyways, I would like to add some key points about why defining the PVC in the worker.json definition might be useful:
Thanks for that info!
It might make sense to create an ad-hoc PVC via an initContainer on the Kubernetes job.
Your idea to declare the job and the volume manifest in the same job configuration is interesting, but I don't think the work pool job configuration mechanics could pass the name of a created PVC to the job manifest without some big changes.
The ordering and dependencies of resource creation might work well with Helm instead of plain manifests, but we don't currently support Helm charts.
I don't see an obvious solution to this right now, but I'll keep thinking. If anyone else reading this thread has any ideas, don't hesitate to comment!
Thanks for the response!
Indeed, an initContainer was one of the workarounds I thought could make this work, but the issue lies in the Job lifecycle.
I will add a temporary script to the main container image that creates a dedicated folder in the PVC for each pod.
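For reference, this is roughly the pod spec portion of the job manifest I have in mind; the claim name prefect-shared-pvc, the mount path /opt/prefect/work and the image name are placeholders. It mounts one long-lived PVC and exposes the pod name through the downward API, so the image's start.sh can create and work inside /opt/prefect/work/$POD_NAME instead of a shared directory.

```json
{
  "volumes": [
    {
      "name": "shared-scratch",
      "persistentVolumeClaim": { "claimName": "prefect-shared-pvc" }
    }
  ],
  "containers": [
    {
      "name": "prefect-job",
      "image": "my-registry/prefect-custom:latest",
      "env": [
        {
          "name": "POD_NAME",
          "valueFrom": { "fieldRef": { "fieldPath": "metadata.name" } }
        }
      ],
      "volumeMounts": [
        { "name": "shared-scratch", "mountPath": "/opt/prefect/work" }
      ]
    }
  ]
}
```

With that in place, the startup script only needs to mkdir -p /opt/prefect/work/$POD_NAME before downloading anything, so concurrent jobs never touch each other's files.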
Describe the current behavior
I would like to attach a PVC to my Prefect flow runs, as some of them create large files that end up consuming the assigned RAM, leading to a 128 error.
I don't know if there is already a defined solution, but after checking the documentation and reaching out on Slack, I still don't know how to approach this issue.
Describe the proposed behavior
In our scenario, our flows run as Kubernetes jobs. Using dynamic storage, like the EFS CSI driver, would allow us to create a PVC on the fly for every flow execution, mitigating the out-of-memory issue.
This could be defined in the same JSON template where the job_configuration is defined.
Example Use
worker-config.json
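Roughly, I imagine the job_manifest section of the template looking something like the sketch below (trimmed to the relevant parts). The {{ flow_run_id }} placeholder does not exist today, and the claim name and mount path are only illustrative; the worker would still need to create the referenced PVC before submitting the Job.

```json
{
  "job_configuration": {
    "job_manifest": {
      "apiVersion": "batch/v1",
      "kind": "Job",
      "spec": {
        "template": {
          "spec": {
            "restartPolicy": "Never",
            "volumes": [
              {
                "name": "flow-scratch",
                "persistentVolumeClaim": {
                  "claimName": "prefect-pvc-{{ flow_run_id }}"
                }
              }
            ],
            "containers": [
              {
                "name": "prefect-job",
                "image": "{{ image }}",
                "volumeMounts": [
                  { "name": "flow-scratch", "mountPath": "/opt/prefect/data" }
                ]
              }
            ]
          }
        }
      }
    }
  }
}
```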
Additional context
First of all, the reference to the PVC could be a little bit tricky, as its name would be randomly generated so that no other job can mount it.
Another way that I thought of would be to allow an array of job_manifests, therefore letting users run additional jobs to configure this kind of thing. If this were a possibility, then I could create the PVC from another job.
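For illustration, this is the kind of PVC such a separate job or manifest could create, assuming the EFS CSI driver is installed with a dynamic storage class named efs-sc; the name, annotation and requested size are placeholders (EFS does not enforce the size, but the field is required by the PVC API).

```json
{
  "apiVersion": "v1",
  "kind": "PersistentVolumeClaim",
  "metadata": {
    "name": "prefect-flow-run-scratch",
    "annotations": { "note": "placeholder values; adjust name, storage class and size" }
  },
  "spec": {
    "accessModes": ["ReadWriteMany"],
    "storageClassName": "efs-sc",
    "resources": { "requests": { "storage": "5Gi" } }
  }
}
```

The flow run's Job would then mount it by claimName, the same way as in the template sketch above.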