Volume-based artifact passing system

Ark-kun commented 5 years ago

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

I'd like to implement a feature that automatically mounts a single volume to the workflow pods to passively orchestrate the data passing.

I'm working on implementing this feature and will submit PR once it's ready.

It's possible to implement now on top of Argo, but it might be nice to have it built-in. I've previously illustrated the proposal in https://github.com/argoproj/argo/pull/1227#issuecomment-472106438

The main idea is to substitute the Argo's "active" way of passing artifacts (copying, packing, uploading/downloading/unpacking) with a passive system that has many advantages:

Much faster artifact storage I/O. No packing/unpacking. No copying files.
Artifact size is not limited by main/wait container disk sizes.

Syntax (unresolved):

# New syntax
artifactStorage: 
  volume: # Will automatically mount this volume to all Pods in a particular way
    persistentVolumeClaim:
      claimName: vol01

#The rest of the code is the usual artifact-passing syntax
templates:
  - name: producer
    outputs:
      artifacts:
      - name: out-art1
        path: /argo/outputs/out-art1/data
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world > /argo/outputs/out-art1/data"]

  - name: consumer
    inputs:
      artifacts:
      - name: in-art1
        path: /argo/inputs/in-art1/data
    container:
      image: docker/whalesay:latest
      command: [cat, '/argo/inputs/in-art1/data']

  - name: main
    dag:
      tasks:
      - name: producer-task
        template: producer
      - name: consumer-task
        template: consumer
        arguments:
          artifacts:
          - name: in-art1
            from: "{{tasks.producer-task.outputs.artifacts.out-art1}}"

Transformed spec:

volumes:
  - name: argo-storage
    persistentVolumeClaim:
      claimName: vol01
templates:
  - name: producer
    outputs:
      parameters:
      - name: out-art1-subpath
        value: "{{workflow.uid}}/{{pod.name}}/out-art1/"
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world > /argo/outputs/out-art1/data"]
      volumeMounts:
        - name: argo-storage
          mountPath: /argo/outputs/out-art1/
          subPath: "{{workflow.uid}}/{{pod.name}}/out-art1/"

  - name: consumer
    inputs:
      parameters:
      - name: in-art1-subpath
    container:
      image: docker/whalesay:latest
      command: [cat, '/argo/inputs/in-art1/data']
      volumeMounts:
        - name: argo-storage
          mountPath: /argo/inputs/in-art1/
          subPath: "{{input.parameters.in-art1-subpath}}"
          readOnly: true

  - name: main
    dag:
      tasks:
      - name: producer-task
        template: producer
      - name: consumer-task
        template: consumer
        arguments:
          artifacts:
          - name: in-art1-subpath
            from: "{{tasks.producer-task.outputs.parameters.out-art1-subpath}}"

This system becomes becomes better with the https://github.com/argoproj/argo/issues/1329, https://github.com/argoproj/argo/issues/1348 and https://github.com/argoproj/argo/pull/1300 features.

https://github.com/argoproj/argo/pull/1300 adds the ability to reference the artifact paths which is required for using generated paths
https://github.com/argoproj/argo/issues/1348 adds the ability to generate artifact paths. This ensures that the paths are compatible with volume mounts.
https://github.com/argoproj/argo/issues/1329 allows passing the resolved paths to the downstream tasks without having to hack the command-line.

JoshRagem commented 5 years ago

there is a serious issue with this approach on AWS ebs volumes--the volumes will fail to attach and/or mount once you have two or more pods on different nodes. If your proposal here could be extended with some option to prefer scheduling pods on nodes that already have the volume attached (if allowed by resource requests), that might reduce the errors

Ark-kun commented 5 years ago

there is a serious issue with this approach on AWS ebs volumes--the volumes will fail to attach and/or mount once you have two or more pods on different nodes. If your proposal here could be extended with some option to prefer scheduling pods on nodes that already have the volume attached (if allowed by resource requests), that might reduce the errors

Is it true that AWS does not support any multi-write volume types that work for any set of pods?

Ark-kun commented 4 years ago

Here is a draft rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py It can be run as a command-line program to rewrite Argo Workflow from artifacts to volume-based data passing.

What does everyone think?

danxmoran commented 4 years ago

Hi @Ark-kun, I'm exploring Argo for a use-case where I want to:

Query metadata about a file in external storage (i.e. FTP), outputting its size
Dynamically generate a volume big enough to store the downloaded file
Download the file to the volume
Mount the volume in a separate task, for processing

Would this proposal support the step-level (or template-level) dynamic volume sizing that I'd need to implement this flow?

Ark-kun commented 4 years ago

The per-step or per-artifact volumes could technically be implemented as another rewriting layer on top of the one in this issue. (My rewriter scrip will make it easier. You'll just need to change subPaths to volume names.)

This issue is more geared towards centralized data storage though.

alexec commented 4 years ago

Could you use PVCs for this?

Ark-kun commented 4 years ago

Could you use PVCs for this?

If this is a question for me, then yes - the proposed feature and the implementation script are volume-agnostic. Any volume can be used and probably most users will specify some PVC even if only for a layer of indirection.

BlackRider97 commented 4 years ago

Is there any issue if I use AzureFiles as persistent volume as Argo since it provides concurrent access on volume which is the limitation with EBS ?

hadim commented 4 years ago

@Ark-kun are you still planning to implement this feature? The lifecycle of the artifacts in Argo could be an issue for us as it involves a lot of copying/downloading/uploading.

hadim commented 4 years ago

Also, how would you automatically remove the PVC at the end of the workflow? A typical workflow for us would be:

setup a PVC
get some data on S3
step1: use data from S3 and generate new data on PVC
step2: use data from step1 and generate new data on PVC
etc...
upload data generate by the last step on S3
delete PVC

rmgogogo commented 4 years ago

Also, how would you automatically remove the PVC at the end of the workflow? A typical workflow for us would be:

setup a PVC

get some data on S3

step1: use data from S3 and generate new data on PVC

step2: use data from step1 and generate new data on PVC

etc...

upload data generate by the last step on S3

delete PVC

Any reason why not directly read/write S3? Is it because the libary doesn't support S3 interface?

hadim commented 4 years ago

We don't want to upload/download our data at each step for performance purposes. Using PVC solves this. We only use artifacts for the first and last steps.

Ark-kun commented 4 years ago

@Ark-kun are you still planning to implement this feature?

I've implemented this feature back in October 2019 as a separate script which can transform a subset of DAG-based workflows:

Here is a draft rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py It can be run as a command-line program to rewrite Argo Workflow from artifacts to volume-based data passing.

I wonder whether we need to add it to the Argo controller itself (as it can just be used as a preprocessor). WDYT?

Also, how would you automatically remove the PVC at the end of the workflow?

It could be possible to do using exit handler and resource templates.

My main scenario requires the volume to persist between restarts. This allows implementing the intermediate data caching so that when you run a modified pipeline it can skip already computed parts instead of running all steps every time. (There probably needs to be some garbage collection system that deletes the expired data.)

alexec commented 3 years ago

Relates to #4130 and #2551

argoproj / argo-workflows

Volume-based artifact passing system #1349