Ark-kun opened this issue 5 years ago
There is a serious issue with this approach on AWS EBS volumes: they will fail to attach and/or mount once two or more pods run on different nodes. If this proposal could be extended with an option to prefer scheduling pods onto nodes that already have the volume attached (when allowed by resource requests), that might reduce these errors.
Is it true that AWS does not support any multi-write volume types that work for any set of pods?
Here is a draft rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py It can be run as a command-line program that rewrites an Argo Workflow from artifact-based to volume-based data passing.
What does everyone think?
Hi @Ark-kun, I'm exploring Argo for a use-case where I want to:
Would this proposal support the step-level (or template-level) dynamic volume sizing that I'd need to implement this flow?
Per-step or per-artifact volumes could technically be implemented as another rewriting layer on top of the one in this issue. (My rewriter script will make that easier; you would just need to change subPaths to volume names.)
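That second rewriting layer could be sketched roughly as follows. This is a hedged illustration, not the actual rewriter's code: the simplified workflow-dict shape, the shared volume name `data`, and the `vol-` naming convention are all assumptions.

```python
def to_per_step_volumes(workflow: dict) -> dict:
    """Illustrative sketch: replace subPath-based mounts on a shared
    'data' volume with one named volume per subPath.

    The dict mirrors a (heavily simplified) Argo Workflow manifest.
    """
    volume_names = set()
    for template in workflow.get("spec", {}).get("templates", []):
        container = template.get("container")
        if not container:
            continue
        for mount in container.get("volumeMounts", []):
            if mount.get("name") == "data" and "subPath" in mount:
                sub_path = mount.pop("subPath")
                # Derive a per-artifact volume name from the subPath.
                mount["name"] = "vol-" + sub_path.replace("/", "-")
                volume_names.add(mount["name"])
    # Declare one volume per derived name (emptyDir only as a
    # placeholder; a real rewrite would create PVCs instead).
    workflow["spec"]["volumes"] = [
        {"name": n, "emptyDir": {}} for n in sorted(volume_names)
    ]
    return workflow
```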
This issue is more geared towards centralized data storage though.
Could you use PVCs for this?
If this is a question for me, then yes - the proposed feature and the implementation script are volume-agnostic. Any volume can be used and probably most users will specify some PVC even if only for a layer of indirection.
Is there any issue if I use AzureFiles as the persistent volume for Argo, since it provides concurrent access to the volume, which is the limitation with EBS?
@Ark-kun are you still planning to implement this feature? The lifecycle of the artifacts in Argo could be an issue for us as it involves a lot of copying/downloading/uploading.
Also, how would you automatically remove the PVC at the end of the workflow? A typical workflow for us would be:
- setup a PVC
- get some data on S3
- step1: use data from S3 and generate new data on PVC
- step2: use data from step1 and generate new data on PVC
- etc...
- upload data generate by the last step on S3
- delete PVC
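For the claim-lifecycle part of that flow, Argo's built-in `volumeClaimTemplates` may already cover it: the claim is created when the workflow starts and, if I recall correctly, can be garbage-collected after the workflow finishes via `volumeClaimGC`. A rough, illustrative sketch (the names, size, and referenced templates are assumptions):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pvc-data-passing-
spec:
  entrypoint: main
  # Created at workflow start; claim deletion after completion is
  # handled by Argo's volume claim GC (configurable).
  volumeClaimTemplates:
    - metadata:
        name: workdir
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
  templates:
    - name: main
      steps:
        - - name: download-from-s3   # input artifact from S3
            template: download
        - - name: step1              # writes results to the shared PVC
            template: step1
        - - name: step2              # reads step1's output from the PVC
            template: step2
        - - name: upload-to-s3       # output artifact to S3
            template: upload
```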
Any reason not to read/write S3 directly? Is it because the library doesn't support the S3 interface?
We don't want to upload/download our data at each step for performance purposes. Using PVC solves this. We only use artifacts for the first and last steps.
@Ark-kun are you still planning to implement this feature?
I've implemented this feature back in October 2019 as a separate script which can transform a subset of DAG-based workflows:
> Here is a draft rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py It can be run as a command-line program that rewrites an Argo Workflow from artifact-based to volume-based data passing.
I wonder whether we need to add it to the Argo controller itself (as it can just be used as a preprocessor). WDYT?
Also, how would you automatically remove the PVC at the end of the workflow?
This could be done using an exit handler together with a resource template.
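A minimal sketch of that approach, assuming a claim named `workdir` that was created outside of `volumeClaimTemplates`:

```yaml
spec:
  entrypoint: main
  onExit: delete-pvc        # runs on success and on failure
  templates:
    - name: delete-pvc
      resource:
        action: delete
        manifest: |
          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: workdir
```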
My main scenario requires the volume to persist between restarts. This enables intermediate data caching: when you run a modified pipeline, it can skip already-computed parts instead of re-running every step. (There would probably need to be a garbage-collection system that deletes expired data.)
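The caching idea above can be sketched as: key each step's output directory on the persistent volume by a hash of the step's command and input digests; if the directory already exists, a previous run produced the same output and the step can be skipped. This is a hedged sketch under those assumptions, not an actual Argo or KFP API:

```python
import hashlib
import os

def cached_output_dir(volume_root, step_command, input_digests):
    """Return (directory, cache_hit) for a step on the shared volume.

    The cache key hashes the step's command plus digests of its
    inputs, so a modified pipeline only re-runs changed steps.
    All names here are illustrative.
    """
    key = hashlib.sha256(
        "\n".join([step_command, *input_digests]).encode()
    ).hexdigest()
    out_dir = os.path.join(volume_root, "cache", key)
    # A hit means an earlier run already produced this step's output.
    return out_dir, os.path.isdir(out_dir)
```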
Relates to #4130 and #2551
Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST
I'd like to implement a feature that automatically mounts a single volume to the workflow pods to passively orchestrate the data passing.
I'm working on implementing this feature and will submit PR once it's ready.
It's possible to implement now on top of Argo, but it might be nice to have it built-in. I've previously illustrated the proposal in https://github.com/argoproj/argo/pull/1227#issuecomment-472106438
The main idea is to substitute Argo's "active" way of passing artifacts (copying, packing, uploading/downloading, unpacking) with a passive system that has many advantages:
Syntax (unresolved):
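The original syntax sketch is not reproduced in this thread. Purely as an illustration of the kind of option the proposal describes (a workflow-level switch that tells the controller to pass data through a mounted volume instead of artifacts), it might look something like this; every field name below is a guess, not the proposed syntax:

```yaml
# Hypothetical only; the proposal marks the syntax as unresolved.
spec:
  artifactPassing:
    volume:
      persistentVolumeClaim:
        claimName: data-passing-pvc
```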
Transformed spec:
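The transformed spec is likewise not shown here. Based on the rewriter's description, each step presumably mounts the shared volume and reads/writes its data under a unique subPath instead of declaring Argo artifacts. A hedged sketch (all names and paths are illustrative):

```yaml
templates:
  - name: step1
    container:
      image: alpine:3.19
      command: [sh, -c, "echo hello > /tmp/outputs/result/data"]
      volumeMounts:
        - name: data-passing-volume
          mountPath: /tmp/outputs/result
          # Each output gets a unique subPath on the shared volume,
          # so a downstream step can mount the same subPath as input.
          subPath: artifacts/step1/result
volumes:
  - name: data-passing-volume
    persistentVolumeClaim:
      claimName: data-passing-pvc
```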
This system works even better together with the features proposed in https://github.com/argoproj/argo/issues/1329, https://github.com/argoproj/argo/issues/1348 and https://github.com/argoproj/argo/pull/1300.