elixir-cloud-aai / TESK

GA4GH Task Execution Service Root Project + Deployment scripts on Kubernetes
https://tesk.readthedocs.io
Apache License 2.0

S3 input/output support #20

Closed · erikvdbergh closed 1 year ago

erikvdbergh commented 6 years ago

Filer needs to support S3 upload and download (public buckets)

Not sure if private buckets should also be supported.
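
For context, the main difference between the two cases is whether credentials have to be supplied at all. As a rough illustration of the transfers the filer would have to perform (using the AWS CLI purely as a stand-in; bucket names are placeholders):

    # Public bucket: objects can be fetched without credentials
    aws s3 cp s3://some-public-bucket/test/test.file /tmp/test.file --no-sign-request

    # Private bucket: an access key pair is needed, e.g. via environment variables
    export AWS_ACCESS_KEY_ID='<access-key-id>'
    export AWS_SECRET_ACCESS_KEY='<secret-access-key>'
    aws s3 cp /tmp/test.file s3://some-private-bucket/test/test.file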

susheel commented 6 years ago

Inputs Test Setup

  1. Build multi-filer container
  2. Obtain S3 credentials from Gianni (see Slack)

Inputs Test Script

  1. Create a test.file in `s3://tesk-bucket/test/test.file`
  2. Create test.json file
    {
      "inputs": [
        {
          "name": "input1",
          "description": "Example",
          "url": "s3://tesk-bucket/test/test.file",
          "path": "/tmp/test.file",
          "type": "FILE"
        }
      ]
    }
  3. Run multi-filer
    docker run -it \
    -v $(pwd):/globus/ \
    -e SRC_S3_ACCESS_KEY_ID='<access-key-id>' \
    -e SRC_S3_SECRET_ACCESS_KEY='<secret-access-key>' \
    <multi-filer CONTAINER> \
    inputs test.json

psafont commented 6 years ago

This is the command that worked for me:

$ docker run -v $PWD:/globus/ \
  -e SRC_S3_ACCESS_KEY_ID=<snipped> \
  -e SRC_S3_SECRET_ACCESS_KEY=<snipped> \
  susheel/multi-filer:latest \
  /multi-filer.sh inputs /globus/test.json

output:

/multi-transfer.sh s3://tesk-bucket/test/test.file /tmp/test.file

It finished without errors, but how do I make sure the file was uploaded correctly?

susheel commented 6 years ago

Sorry I missed this. I would try uploading a file using the tool first and then try downloading the same file.
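
As an independent check (assuming the AWS CLI is installed and the same credentials are exported), you could also query the bucket directly, for example:

    export AWS_ACCESS_KEY_ID='<access-key-id>'
    export AWS_SECRET_ACCESS_KEY='<secret-access-key>'

    # list the key the filer should have written
    aws s3 ls s3://tesk-bucket/test/

    # fetch it back and compare checksums with the local copy
    aws s3 cp s3://tesk-bucket/test/test.file /tmp/test.file.check
    md5sum test.file /tmp/test.file.check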

psafont commented 6 years ago

Would this be the correct command line for outputting a file?

$ docker run -v $PWD:/globus/ \
  -e SRC_S3_ACCESS_KEY_ID=<snipped> \
  -e SRC_S3_SECRET_ACCESS_KEY=<snipped> \
  susheel/multi-filer:latest \
  /multi-filer.sh outputs /globus/test.json

It's not clear to me how the file from the bucket is chosen to be output to the local system, or what the relation is between input/output and import/export.

susheel commented 6 years ago

Try this...

S3 File Upload Test Script

  1. Create a dummy test.file in your local $(pwd)

  2. Rebuild multi-filer - I forgot to remove the logging comments

  3. Use this test.json

    {
      "inputs": [
        {
          "name": "input1",
          "description": "Example Input",
          "url": "s3://tesk-bucket/test/test.file",
          "path": "/globus/test.file",
          "type": "FILE"
        }
      ],
      "outputs": [
        {
          "name": "output1",
          "description": "Example Output",
          "url": "s3://tesk-bucket/test/test.file",
          "path": "/globus/test.file",
          "type": "FILE"
        }
      ]
    }
  4. Upload using multi-filer for outputs

    docker run -v $(pwd):/globus/ \
    -e DEST_S3_ACCESS_KEY_ID=<snipped> \
    -e DEST_S3_SECRET_ACCESS_KEY=<snipped> \
    susheel/multi-filer:latest \
    /multi-filer.sh outputs /globus/test.json

    This will have placed a new file in the bucket.

  5. Download using multi-filer for inputs

    docker run -v $(pwd):/globus/ \
    -e SRC_S3_ACCESS_KEY_ID=<snipped> \
    -e SRC_S3_SECRET_ACCESS_KEY=<snipped> \
    susheel/multi-filer:latest \
    /multi-filer.sh inputs /globus/test.json
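
One quick way to confirm the round trip actually moved data (a sketch only; note that step 5 writes the downloaded copy back over the same $(pwd)/test.file) is to record a checksum before uploading and re-check it after downloading:

    # before step 4: record the dummy file's checksum
    md5sum test.file > test.file.md5

    # ... run steps 4 and 5 ...

    # after step 5: the re-downloaded file should match
    md5sum -c test.file.md5
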
psafont commented 6 years ago

I'm not able to upload the dummy file:

2018/05/18 11:45:02 ERROR : : error reading destination directory: AllAccessDisabled: All access to this object has been disabled
        status code: 403
2018/05/18 11:45:02 ERROR : S3 bucket test path test.file/: not deleting files as there were IO errors
2018/05/18 11:45:02 ERROR : S3 bucket test path test.file/: not deleting directories as there were IO errors
2018/05/18 11:45:02 ERROR : Attempt 1/3 failed with 1 errors and: not deleting files as there were IO errors
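
That 403 looks like the bucket or the key pair rejecting the request rather than the filer mishandling the transfer. One way to separate the two (a sketch, assuming the AWS CLI and the same DEST_* credentials) is to hit the bucket directly and see whether the same error comes back:

    export AWS_ACCESS_KEY_ID='<dest-access-key-id>'
    export AWS_SECRET_ACCESS_KEY='<dest-secret-access-key>'

    # if these also fail with AllAccessDisabled / 403, the problem is the
    # credentials or the bucket policy, not the multi-filer itself
    aws s3 ls s3://tesk-bucket/test/
    aws s3 cp test.file s3://tesk-bucket/test/test.file
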
kfox1111 commented 5 years ago

I'm interested in this too. Plus the ability to target minio as the s3 backend.
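
For anyone looking at that: MinIO speaks the S3 API, so the main requirement on the filer side is being able to override the endpoint URL instead of assuming AWS. With the AWS CLI, for example (endpoint and credentials here are placeholders):

    export AWS_ACCESS_KEY_ID='<minio-access-key>'
    export AWS_SECRET_ACCESS_KEY='<minio-secret-key>'

    # point the S3 client at a MinIO server instead of AWS
    aws s3 ls s3://tesk-bucket/ --endpoint-url http://minio.example.org:9000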