ccagc / QDNAseq

QDNAseq package for Bioconductor

Downloading 1000 Genomes samples is inactive #118

Open kojimak324 opened 10 months ago

kojimak324 commented 10 months ago

I read section 5, "Downloading 1000 Genomes samples", of the Introduction to QDNAseq vignette (https://bioconductor.org/packages/release/bioc/vignettes/QDNAseq/inst/doc/QDNAseq.pdf).

I can't download the FASTQ files. The URL given there, urlroot <- "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp", no longer seems to work.

I used this issue as a reference: https://github.com/ccagc/QDNAseq/issues/59

With urlroot <- "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/historical_data/former_toplevel", read.table() on sequence.index worked fine.

However, the download loop does not work:

    for (i in rownames(g1k)) {
        sourceFile <- file.path(urlroot, g1k[i, "FASTQ_FILE"])
        destFile <- g1k[i, "fileName"]
        if (!file.exists(destFile))
            download.file(sourceFile, destFile, mode="wb")
    }

It fails with the following error messages:


    trying URL 'ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3//data/NA18624/sequence_read/ERR008841.filt.fastq.gz'
    Content type 'unknown' length 1196873909 bytes (1141.4 MB)

    Error in download.file(sourceFile, destFile, mode = "wb") :
      cannot open URL 'ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3//data/NA18624/sequence_read/ERR008841.filt.fastq.gz'
    In addition: Warning messages:
    1: In download.file(sourceFile, destFile, mode = "wb") :
      downloaded length 0 != reported length 0
    2: In download.file(sourceFile, destFile, mode = "wb") :
      URL 'ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3//data/NA18624/sequence_read/ERR008841.filt.fastq.gz': Timeout of 60 seconds was reached


Following the issue linked above, I also tried setting seqroot = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/" before this step.

However, I get the same error. How can I download the 1000 Genomes samples?
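
One thing that may be worth checking: the last warning above, "Timeout of 60 seconds was reached", matches R's default value for the timeout option, which download.file() honours, and the reported file size is about 1.1 GB. A minimal sketch of a retry with a longer timeout follows, assuming the timeout (and not the URL itself) is what stops the transfer. The helper name download_with_timeout, the 3600 second default, and the ".part" suffix are illustrative choices, not part of QDNAseq or the vignette.

```r
# Hypothetical helper (not part of QDNAseq): download one file with a
# temporarily raised timeout, skipping files that already exist.
download_with_timeout <- function(sourceFile, destFile, timeout = 3600) {
  if (file.exists(destFile)) return(invisible(destFile))

  # download.file() uses getOption("timeout"), which defaults to 60 seconds.
  # Raise it for this call only and restore the old value on exit.
  old <- options(timeout = max(timeout, getOption("timeout")))
  on.exit(options(old), add = TRUE)

  # Write to a temporary name first, so that an interrupted transfer is not
  # mistaken for a complete file by the file.exists() check on a later run.
  partFile <- paste0(destFile, ".part")
  on.exit(unlink(partFile), add = TRUE)

  status <- download.file(sourceFile, partFile, mode = "wb")
  if (status != 0) stop("download failed: ", sourceFile)

  file.rename(partFile, destFile)
  invisible(destFile)
}

# Usage with the loop from this issue (g1k and urlroot as defined there):
# for (i in rownames(g1k)) {
#   download_with_timeout(file.path(urlroot, g1k[i, "FASTQ_FILE"]),
#                         g1k[i, "fileName"])
# }
```

If the error persists with a longer timeout, the cause is more likely the server or the URL than the R settings.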