elixir-europe / MARS

Multi-Repository Data Submission using ISA-JSON
MIT License

Map data files to repository #87

Closed: kdp-cloud closed this 2 weeks ago

kdp-cloud commented 3 weeks ago

The CLI does not require any additional options: it tries to map every data file passed on the command line to a repository, based on the ISA-JSON.

Current logic:

Closes #78
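
As a rough illustration of the mapping described above (a sketch, not the actual MARS implementation): the function below groups the file paths given on the command line by the repository of the assay that references them. It assumes that each assay carries a comment named `target_repository` and lists its files under `dataFiles` with a `name` field, and that files are matched by base name only. The function name and the matching rule are hypothetical.

```python
from pathlib import Path

# Assumption: the target repository is stored as an assay comment with this name.
TARGET_REPO_KEY = "target_repository"


def map_data_files_to_repositories(isa_json: dict, data_files: list[str]) -> dict[str, list[str]]:
    """Group local file paths by the repository of the assay that references them."""
    # Index the files passed on the command line by base name, because the
    # ISA-JSON is assumed to hold file names rather than full local paths.
    local_by_name = {Path(p).name: p for p in data_files}

    mapping: dict[str, list[str]] = {}
    for study in isa_json.get("investigation", {}).get("studies", []):
        for assay in study.get("assays", []):
            # Find the repository this assay is destined for, if any.
            repo = next(
                (
                    c.get("value")
                    for c in assay.get("comments", [])
                    if c.get("name") == TARGET_REPO_KEY
                ),
                None,
            )
            if repo is None:
                continue
            for data_file in assay.get("dataFiles", []):
                local_path = local_by_name.get(Path(data_file.get("name", "")).name)
                if local_path is not None:
                    mapping.setdefault(repo, []).append(local_path)
    return mapping
```

Files that no assay references are simply left out of the result, so a caller could compare the mapped files against the full input list to warn about unmapped files.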

kdp-cloud commented 3 weeks ago

Just to be sure, here is a test submission:

############# Welcome to the MARS CLI. #############
Running in Development environment
Starting submission of the ISA JSON to the target repositories: biosamples, ena.
ISA JSON with investigation 'Bob's investigation' is valid.
Submission to biosamples was successful. Result:
{'targetRepository': 'biosamples', 'errors': [], 'info': [], 'accessions': [{'value': 'SAMEA131435379', 'path': [{'key': 'investigation'}, {'key': 'studies', 'where': {'key': 'title', 'value': 'Arabidopsis thaliana'}}, {'key': 'materials'}, {'key': 'samples', 'where': {'key': 'name', 'value': 'leaf 1'}}]}, {'value': 'SAMEA131435378', 'path': [{'key': 'investigation'}, {'key': 'studies', 'where': {'key': 'title', 'value': 'Arabidopsis thaliana'}}, {'key': 'materials'}, {'key': 'sources', 'where': {'key': 'name', 'value': 'plant 1'}}]}]}
#sample/331: comments=[] id='#ontology_annotation/accession_#sample/331' annotationValue='SAMEA131435379' termAccession=None termSource=None.
#source/330: comments=[] id='#ontology_annotation/accession_#source/330' annotationValue='SAMEA131435378' termAccession=None termSource=None.
Uploading ENA_TEST2.R2g.fastq.gz to FTP
Start submitting to ena.
Submission to ena was successful. Result:
{'targetRepository': 'ena', 'errors': [], 'info': [{'message': 'This submission is a TEST submission and will be discarded within 24 hours'}], 'accessions': [{'value': 'ERP166052', 'path': [{'key': 'investigation'}, {'key': 'studies', 'where': {'key': 'title', 'value': 'Arabidopsis thaliana'}}]}, {'value': 'ERX13337714', 'path': [{'key': 'investigation'}, {'key': 'studies', 'where': {'key': 'title', 'value': 'Arabidopsis thaliana'}}, {'key': 'assays', 'where': {'key': '@id', 'value': '#assay/18_20_21'}}, {'key': 'materials'}, {'key': 'otherMaterials', 'where': {'key': '@id', 'value': '#other_material/332'}}]}, {'value': 'ERX13337715', 'path': [{'key': 'investigation'}, {'key': 'studies', 'where': {'key': 'title', 'value': 'Arabidopsis thaliana'}}, {'key': 'assays', 'where': {'key': '@id', 'value': '#assay/18_20_21'}}, {'key': 'materials'}, {'key': 'otherMaterials', 'where': {'key': '@id', 'value': '#other_material/333'}}]}]}
Update ISA-JSON based on receipt from ena.
#other_material/332: comments=[] id='#ontology_annotation/accession_#other_material/332' annotationValue='ERX13337714' termAccession=None termSource=None.
#other_material/333: comments=[] id='#ontology_annotation/accession_#other_material/333' annotationValue='ERX13337715' termAccession=None termSource=None.
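
The `path` entries in the receipts above can be read as a walk through the ISA-JSON: each step names a key, and an optional `where` clause selects one element from a list. The following is a minimal sketch of such a lookup, assuming exactly that reading of the receipt format. `resolve_path` is a hypothetical helper, not a function from the MARS code base.

```python
def resolve_path(isa_json: dict, path: list[dict]):
    """Follow a receipt-style path and return the ISA-JSON node it points to."""
    node = isa_json
    for step in path:
        node = node[step["key"]]
        where = step.get("where")
        if where is not None:
            # The current node is a list: keep the single element whose
            # field `where["key"]` equals `where["value"]`.
            matches = [item for item in node if item.get(where["key"]) == where["value"]]
            if len(matches) != 1:
                raise LookupError(
                    f"expected exactly one match for {where}, found {len(matches)}"
                )
            node = matches[0]
    return node
```

Raising on zero or multiple matches is a design choice of this sketch: an accession attached to the wrong sample is harder to detect later than a failed lookup.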

apriltuesday commented 2 weeks ago

> CLI does not require more options

How disappointing, every new feature should come with at least 3 new options :wink:

kdp-cloud commented 2 weeks ago

> The code looks good, but while I was reviewing this, I started wondering: do we even need the --data-files option? Or can we get rid of it and just use the files passed in the ISA-JSON directly?
>
> Is it because we can't use a full filepath in the ISA-JSON, or the full path might change between the ISA-JSON producer and the broker?

Indeed, we need to know where the data files are located, and this information is not in the ISA-JSON. But I agree that we should find a way to reduce the number of options and arguments.

bedroesb commented 2 weeks ago

@kdp-cloud did you by any chance also do a metabolights test run?

kdp-cloud commented 2 weeks ago

> @kdp-cloud did you by any chance also do a metabolights test run?

No, sorry! I don't have credentials yet for submitting to metabolights (shame on me! :smile:).