This PR adds a few features:
- Automatically create the /bioproject/{bioprojects}/metadata directory and populate the metadata.tsv file with sample IDs when fetching the actual sequencing data.
- Fetch the SRA metadata from NCBI using the E-utilities API and save it to /bioproject/{bioprojects}/metadata/metadata-ncbi.tsv.
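
For reviewers unfamiliar with E-utilities, here is a minimal sketch of how the NCBI fetch could look. It is not the code in this PR. The function names, the `root` argument, and the choice of the `runinfo` return type are assumptions for illustration, as are the `querykey` and `webenv` keys read from the esearch JSON response.

```python
import csv
import io
import json
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def eutils_url(endpoint, **params):
    """Build an E-utilities request URL (endpoint is e.g. 'esearch' or 'efetch')."""
    return f"{EUTILS}/{endpoint}.fcgi?{urlencode(params)}"


def runinfo_csv_to_tsv(csv_text):
    """Convert comma-separated run metadata to TSV, dropping blank lines."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


def fetch_ncbi_metadata(bioproject, root=Path("bioproject")):
    """Hypothetical helper: save <root>/<bioproject>/metadata/metadata-ncbi.tsv.

    Requires network access. `root` is an assumption, not a path from the PR.
    """
    # Step 1: search the SRA database and keep the hits on the history server.
    search = eutils_url(
        "esearch", db="sra", term=bioproject, usehistory="y", retmode="json"
    )
    with urlopen(search) as resp:
        result = json.load(resp)["esearchresult"]

    # Step 2: fetch the run table for those hits (assumed to come back as CSV).
    fetch = eutils_url(
        "efetch",
        db="sra",
        query_key=result["querykey"],
        WebEnv=result["webenv"],
        rettype="runinfo",
        retmode="text",
    )
    with urlopen(fetch) as resp:
        csv_text = resp.read().decode("utf-8")

    # Step 3: create the metadata directory if needed and write the TSV.
    out_dir = root / bioproject / "metadata"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "metadata-ncbi.tsv"
    out_path.write_text(runinfo_csv_to_tsv(csv_text))
    return out_path
```

Splitting URL construction and CSV-to-TSV conversion into their own functions keeps the network call small and lets the rest be tested offline.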
The "Run" column of the NCBI metadata should match the first column of metadata.tsv, but this may not always be the case (perhaps because the data is being downloaded from www.ebi.ac.uk?), so we still use the metadata.tsv file.
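
A quick way to check whether the two files agree is to compare the ID sets. The sketch below assumes metadata.tsv has no header row and that metadata-ncbi.tsv has a header containing "Run"; `compare_run_ids` is a hypothetical helper, not part of this PR.

```python
import csv


def compare_run_ids(metadata_tsv, ncbi_tsv):
    """Return (IDs only in metadata.tsv, IDs only in the NCBI "Run" column)."""
    with open(metadata_tsv, newline="") as f:
        # No header row is assumed here: the sample ID is the first column.
        local_ids = {row[0] for row in csv.reader(f, delimiter="\t") if row}
    with open(ncbi_tsv, newline="") as f:
        # The NCBI file is assumed to start with a header row naming "Run".
        ncbi_ids = {row["Run"] for row in csv.DictReader(f, delimiter="\t")}
    return sorted(local_ids - ncbi_ids), sorted(ncbi_ids - local_ids)
```

Two empty lists mean the files agree; anything else lists the IDs that appear on one side only.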
Note - I would also like to add a column name row to metadata.csv to better document any additional metadata we might add. I haven't done this yet, as it will require modifying prepare-dashboard.py (and maybe other files?).