MAAP-Project / maap-hec-aws


R3: Use the ARC/NAS data caching mechanism in stage-in. #122

Open jjacob7734 opened 1 year ago

jjacob7734 commented 1 year ago

The ARC/NAS data caching solution is described at https://drive.google.com/file/d/1bvEHzfGwCsyLgkVqsdmuytiHGNBKb7Sv/view. It relies on the list of input data URLs being available at the start of the workflow. For the public and private S3 bucket use cases, configure the CI/CD for the workflow to accept this input file list as an input parameter, and make the stage-in step skip the download whenever the caching implementation on Pleiades is expected to perform it. The ADES-PBS needs to read the input file list and inject the PBS directives for the caching into the PBS bash script, as described in the caching design document linked above. For the private bucket use case, AWS keys need to be installed in ~/.aws/credentials.
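A rough sketch of the ADES-PBS side, to make the request concrete. It is an illustration under assumptions, not the design: the function names and the one-URL-per-line file format are hypothetical, and `PLACEHOLDER_DIRECTIVE` is a stand-in, since the real directive syntax is defined only in the caching design document linked above.

```python
from pathlib import Path
from typing import Iterable, List

# Stand-in only: the actual caching directive syntax comes from the
# ARC/NAS caching design document and is NOT reproduced here.
PLACEHOLDER_DIRECTIVE = "#CACHE-DIRECTIVE-PLACEHOLDER {url}"


def parse_input_urls(text: str) -> List[str]:
    """Return the URLs in an input file list (assumed: one URL per line)."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # skip blanks and comments
            urls.append(line)
    return urls


def inject_cache_directives(
    script: str, urls: Iterable[str], template: str = PLACEHOLDER_DIRECTIVE
) -> str:
    """Insert one directive line per URL after the script's leading
    shebang / #PBS header lines, before the first executable line."""
    lines = script.splitlines()
    insert_at = 0
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#!") or stripped.startswith("#PBS"):
            insert_at = i + 1
        elif stripped == "" or stripped.startswith("#"):
            continue  # blank or ordinary comment inside the header
        else:
            break  # first executable line ends the header
    directives = [template.format(url=u) for u in urls]
    return "\n".join(lines[:insert_at] + directives + lines[insert_at:]) + "\n"


def should_skip_download(url: str, cached_urls: Iterable[str]) -> bool:
    """Stage-in skips a URL when the caching step is expected to fetch it."""
    return url in set(cached_urls)


def aws_credentials_present(home: Path) -> bool:
    """Private bucket use case: check that ~/.aws/credentials exists."""
    return (home / ".aws" / "credentials").is_file()
```

With something like this, the stage-in step would call `should_skip_download` for each input and download only the URLs that the caching step does not cover, while ADES-PBS would pass the generated script through `inject_cache_directives` before submission.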

Definition of Done: