ebi-ait / hca-ebi-dev-team

Repository for hca ebi dev team agile management. See zenhub board

Sync files from multiple source buckets using HCA (DCP) CLI #88

Open aaclan-ebi opened 4 years ago

aaclan-ebi commented 4 years ago

This ticket should make the Upload Service configurable so that it allows s3-to-s3 transfers from multiple source buckets, making it possible for the user to do the following:

Assuming the upload service is configured to sync from either arn:aws:s3:::test-bucket1 or arn:aws:s3:::test-bucket2

Steps:

  1. Create ingest submission via UI or via API. Get the upload area.

upload area: s3://org-hca-data-archive-upload-dev//

  2. Sync the files from either bucket1 or bucket2 to the Ingest upload area:

    hca upload select s3://org-hca-data-archive-upload-dev/<submission-uuid>/
    hca upload files s3://test-bucket1/subdir/

or

    hca upload files s3://test-bucket2/subdir/

Expected behavior:

The Ingest upload area will contain the following files: test-file.txt and test-file2.txt

Currently, using the hca cli, it is possible to sync files from a source bucket to an ingest submission upload area. However, only one source bucket can be configured at the moment, by setting the staging_bucket_arn terraform variable, which is used in the following code: https://github.com/ebi-ait/upload-service/blob/master/upload/common/upload_area.py#L118
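To make the request more concrete, here is a rough sketch of the kind of check a multi-bucket configuration would need. It is not the actual upload-service code: the function names and the idea of passing a comma-separated list of ARNs (e.g. via a hypothetical plural `staging_bucket_arns` variable) are assumptions for illustration only.

```python
from typing import List

ARN_PREFIX = "arn:aws:s3:::"
S3_SCHEME = "s3://"


def parse_allowed_buckets(config_value: str) -> List[str]:
    """Turn a comma-separated list of S3 bucket ARNs into bucket names.

    Hypothetical helper: assumes the allowed source buckets are supplied
    as a single comma-separated string.
    """
    buckets = []
    for arn in config_value.split(","):
        arn = arn.strip()
        if not arn:
            continue  # tolerate trailing commas and empty entries
        if not arn.startswith(ARN_PREFIX):
            raise ValueError(f"Not an S3 bucket ARN: {arn}")
        buckets.append(arn[len(ARN_PREFIX):])
    return buckets


def bucket_of(s3_url: str) -> str:
    """Extract the bucket name from an s3://bucket/key URL."""
    if not s3_url.startswith(S3_SCHEME):
        raise ValueError(f"Not an S3 URL: {s3_url}")
    return s3_url[len(S3_SCHEME):].split("/", 1)[0]


def is_allowed_source(s3_url: str, allowed_buckets: List[str]) -> bool:
    """Return True if the source URL lives in one of the allowed buckets."""
    return bucket_of(s3_url) in allowed_buckets


if __name__ == "__main__":
    allowed = parse_allowed_buckets(
        "arn:aws:s3:::test-bucket1,arn:aws:s3:::test-bucket2"
    )
    for source in ("s3://test-bucket1/subdir/", "s3://other-bucket/subdir/"):
        print(source, is_allowed_source(source, allowed))
```

With a check like this, both `hca upload files` commands in the steps above would be accepted, while a source in any bucket outside the configured list would be rejected.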

justincc commented 4 years ago

How important is this?

aaclan-ebi commented 4 years ago

This may no longer be relevant if all the functionality of the hca cli is already supported in hca-util.

justincc commented 4 years ago

See also #82, #87

aaclan-ebi commented 4 years ago

Moving this to the Product Backlog lane since it is not part of the current sprint goal.

prabh-t commented 4 years ago

Does the ability to sync from different hca-util upload areas solve this issue? e.g.

hca-util select AREA1
hca-util sync INGEST_UPLOAD_AREA

then, from another area:

hca-util select AREA2
hca-util sync INGEST_UPLOAD_AREA

This assumes that the hca-util upload areas act as the different "sources".

aaclan-ebi commented 4 years ago

If syncing is already supported in the hca-util tool, I believe this ticket is no longer needed.