fstrozzi / rnaseq-encode-nf

An example RNA-seq pipeline with Nextflow and using public ENCODE data

It fails to download some samples #1

Open pditommaso opened 7 years ago

pditommaso commented 7 years ago
$ wget  ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR504/004/SRR5048134/SRR5048134_1.fastq.gz
--2017-10-21 11:47:09--  ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR504/004/SRR5048134/SRR5048134_1.fastq.gz
           => ‘SRR5048134_1.fastq.gz’
Resolving ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)... 193.62.192.7
Connecting to ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)|193.62.192.7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /vol1/fastq/SRR504/004/SRR5048134 ... 
No such directory ‘vol1/fastq/SRR504/004/SRR5048134’.
fstrozzi commented 7 years ago

It appears that not all the samples are stored the same way. Some are stored as FASTQ, others as SRA files (NCBI format). In this case the link is ftp://ftp.sra.ebi.ac.uk/vol1/srr/SRR504/004/SRR5048134. Either we remove these cases from the list of files, or I can implement a check that downloads them and uses SRA-utils to convert them into actual FASTQ.
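
A rough sketch of what such a check could look like, in Python rather than in the pipeline itself. It assumes the path pattern seen in this thread holds in general (`/vol1/fastq/...` swapped for `/vol1/srr/...`, file name dropped), which has only been confirmed here for SRR5048134. The names `sra_fallback_url` and `split_by_format` are hypothetical, and the `exists` callable is left to the caller (for example an FTP lookup), so nothing below touches the network.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

FASTQ_ROOT = "/vol1/fastq/"
SRA_ROOT = "/vol1/srr/"  # assumption: mirrors the FASTQ layout, as in this thread


def sra_fallback_url(fastq_url):
    """Map a FASTQ file URL to the SRA directory URL for the same run."""
    parts = urlsplit(fastq_url)
    run_dir = posixpath.dirname(parts.path)  # drop the FASTQ file name
    if not run_dir.startswith(FASTQ_ROOT):
        raise ValueError(f"not a FASTQ path: {parts.path}")
    sra_dir = SRA_ROOT + run_dir[len(FASTQ_ROOT):]
    return urlunsplit((parts.scheme, parts.netloc, sra_dir, "", ""))


def split_by_format(fastq_urls, exists):
    """Partition URLs into (available FASTQ files, SRA directories to convert).

    `exists` is a caller-supplied function returning True when the URL
    can be downloaded as-is.
    """
    fastq, sra = [], []
    for url in fastq_urls:
        if exists(url):
            fastq.append(url)
            continue
        fallback = sra_fallback_url(url)
        if fallback not in sra:  # _1 and _2 reads share one SRA directory
            sra.append(fallback)
    return fastq, sra
```

Runs that end up in the second list would still need the SRA-to-FASTQ conversion step; the first list can be downloaded unchanged.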

pditommaso commented 7 years ago

I see. Go with whichever way you think is easier. Using SRA-utils would require rebuilding the container, etc. That looks like overkill.