datacarpentry / wrangling-genomics

Data Wrangling and Processing for Genomics
https://datacarpentry.org/wrangling-genomics/

Use of SRA Explorer to Download FASTQ Files? #217

Open jcjimbo opened 4 years ago

jcjimbo commented 4 years ago

Hello datacarpentry,

This is my first time making a comment as a new instructor. Apologies, I am not amazing with git!

I wondered if this would be a suitable point to add as I have found this incredibly useful during my time downloading fastq files. This pertains to the Assessing Read Quality episode.

I used the website https://sra-explorer.info/# to look up SRA runs. It also produces a bash script for downloading the files for any SRA run.

e.g. for the accession SRR2589044:

curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR258/004/SRR2589044/SRR2589044_1.fastq.gz -o SRR2589044_REL2181A_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR258/004/SRR2589044/SRR2589044_2.fastq.gz -o SRR2589044_REL2181A_2.fastq.gz

Here the curl arguments change: -L tells curl to follow any redirects from the download location (ftp), and -o sets the output filename, which is altered to include a name specific to the experiment (REL2181A).
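
For illustration, here is a minimal sketch of how commands like the ones above could be generated for a paired-end run. It only prints the curl commands and does not download anything. The function names build_fastq_url and print_download_commands are hypothetical, and the directory layout is simply copied from the SRR2589044 example, so it is assumed to apply only to accessions of the same 10-character form.

```shell
#!/usr/bin/env bash
# Sketch only: prints curl commands, it does not run them.
# build_fastq_url and print_download_commands are hypothetical helper names.

build_fastq_url() {
    local acc="$1" mate="$2"
    # Layout copied from the SRR2589044 example:
    # first 6 characters / "00" + last digit / accession.
    # Assumption: only valid for 10-character accessions like this one.
    local prefix="${acc:0:6}"
    local sub="00${acc: -1}"
    echo "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${prefix}/${sub}/${acc}/${acc}_${mate}.fastq.gz"
}

print_download_commands() {
    local acc="$1" label="$2" mate
    # One command per read file (_1 and _2) of the paired-end run.
    for mate in 1 2; do
        echo "curl -L $(build_fastq_url "$acc" "$mate") -o ${acc}_${label}_${mate}.fastq.gz"
    done
}

print_download_commands SRR2589044 REL2181A
```

Copying the printed lines into a terminal would then perform the actual downloads.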

The only issue is this would alter the code from the workshop which is:

curl -O ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR258/004/SRR2589044/SRR2589044_1.fastq.gz
curl -O ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR258/004/SRR2589044/SRR2589044_2.fastq.gz
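
To make the difference between the two forms concrete, the sketch below compares the local filenames they would produce, without contacting the server. With -O, curl names the local file after the last part of the URL. With -o, the name is whatever is supplied. The helper names default_name and labelled_name are hypothetical, and the label REL2181A is taken from the SRA Explorer example.

```shell
#!/usr/bin/env bash
# Sketch only: compares output filenames, no download is performed.
# default_name and labelled_name are hypothetical helper names.

default_name() {
    # The name curl -O would use: the file part of the URL.
    basename "$1"
}

labelled_name() {
    local file label="$2"
    file="$(basename "$1")"
    # Insert the label between the accession and the read number,
    # e.g. SRR2589044_1.fastq.gz becomes SRR2589044_REL2181A_1.fastq.gz
    echo "${file%%_*}_${label}_${file#*_}"
}

url="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR258/004/SRR2589044/SRR2589044_1.fastq.gz"
echo "curl -O saves as: $(default_name "$url")"
echo "curl -o saves as: $(labelled_name "$url" REL2181A)"
```

Any later lesson commands that refer to the files by name would need to match whichever form is used.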

I wonder whether this would complicate the lesson. On the other hand, it may make it easier for learners to apply the lesson to their own datasets, or to other datasets they wish to analyse.

I have not seen this referenced previously. Looking forward to hearing your thoughts.