NBISweden / sda-cli

User command line interface for the SDA
GNU Affero General Public License v3.0
5 stars 1 forks source link

Feature/download dataset #408

Closed kostas-kou closed 2 months ago

kostas-kou commented 2 months ago

Related issue(s) and PR(s)
This PR closes #387.

Description A new feature in the sda-cli sda-download is implemented in order to download all the files of the dataset (decrypted) by adding the flag --dataset.

How to test Run the setup script from the root folder by typing: bash .github/integration/setup/setup.sh. When everything is ready and the containers are running download all the files of the https://doi.example/ty009.sfrrss/600.45asasgadataset by running: ./sda-cli sda-download -config testing/s3cmd-download.conf -datasetID https://doi.example/ty009.sfrrss/600.45asasga -url http://localhost:8080 -outdir test-dataset --dataset. Then check the test-dataset folder and 3 files in different paths will be present.

codecov-commenter commented 2 months ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 50.84746% with 58 lines in your changes missing coverage. Please review.

Project coverage is 43.16%. Comparing base (f258487) to head (64f9419).

Files Patch % Lines
sda_download/sda_download.go 56.81% 32 Missing and 6 partials :warning:
helpers/helpers.go 33.33% 20 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #408 +/- ## ========================================== - Coverage 43.41% 43.16% -0.26% ========================================== Files 13 13 Lines 1861 1923 +62 ========================================== + Hits 808 830 +22 - Misses 944 979 +35 - Partials 109 114 +5 ``` | [Flag](https://app.codecov.io/gh/NBISweden/sda-cli/pull/408/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=NBISweden) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/NBISweden/sda-cli/pull/408/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=NBISweden) | `43.16% <50.84%> (-0.26%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=NBISweden#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

kostas-kou commented 2 months ago

Works well when I tested. Good job @kostas-kou!

I have some suggestions for the command line syntax.

1. The flag `-datasetID` is not very good because the use of upper case letters. Firstly, it is not common to use upper case, and secondly, it is easy to make mistakes when typing the command when it has a combination of upper and lower case. I would suggest to change it to `-datasetid` or `-dataset-id`

2. The flag `-dataset` is not very clear to deliver the message that all files will be downloaded for the given dataset ID. I think `-all` will be better. Then the options will become `(--all | [filepath(s)])`, which is more intuitive

3. When the flag `-dataset` is set, and if one also append some filepaths, the command just went through and download all files in the dataset. I have added another comment in the code that the message `No files provided, downloading all files in the dataset` is a bit misleading, since all files will be downloaded even filepaths are provided.
   We can probably add a validation in the argument parsing so that when `-dataset` is set, error will be reported when filepaths are also provided. That is, either `-dataset` or `filepaths` list can be provided in the command, but not both.

The first and the third comment are fixed now. For the second one I cannot do anything since the task was saying to add the --dataset flag for downloading all the files. We had discussed that when the card was estimated and that was the decision of the team. Of course we can discuss it further if you would prefer a different flag and change it during a refactoring sprint

nanjiangshu commented 2 months ago

The first and the third comment are fixed now. For the second one I cannot do anything since the task was saying to add the --dataset flag for downloading all the files. We had discussed that when the card was estimated and that was the decision of the team. Of course we can discuss it further if you would prefer a different flag and change it during a refactoring sprint

I see. Obviously I have missed that there was a decision about the flag name. I'm fine with it for now.

kostas-kou commented 2 months ago

There where conflicts after the merged PR, I tried to resolve them on the fly and it failed. I am converting this PR to draft until everything is fixed