Open gurpreet-bioinfo opened 3 years ago
Yeah, it seems the Ensembl folder structure and file names are not what the download code expects, at least for releases 84 and 104 (the ones I've checked).
The problem I saw was not exactly the same as yours, @gurpreet-bioinfo. For me, downloading the genome sequence failed with: "Error in download.file(file.path("ftp://ftp.ensembl.org/pub", paste("release-", : cannot open URL 'ftp://ftp.ensembl.org/pub/release-104/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz'". I have fixed this error (so far) by changing the download.file command at the end of the lib/R/DownloadDB.R script from:
download.file(file.path("ftp://ftp.ensembl.org/pub", paste("release-", ensversion, sep = ""), "fasta", myrefid, "dna", myrefid.path), destfile = file.path(tmpfolder, "seq.fa.gz"), quiet = T)
to:
tempcmd <- paste( "cd ", tmpfolder, " && curl ftp://ftp.ensembl.org/pub/release-", ensversion, "/fasta/", myrefid, "/dna/", myrefid.path, " > seq.fa.gz", sep="" )
system( tempcmd )
There's got to be a smarter way to do that, but it fixed the issue for me, allowing the main Perl script to keep running ...
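One possibly tidier variant, offered only as a sketch: stay inside R, ask download.file to use its libcurl backend, and fall back to the curl binary if that fails. The helper names build_ensembl_fasta_url and download_genome are hypothetical and not part of EricScript; ensversion, myrefid and myrefid.path are the variables from DownloadDB.R shown above. I have not confirmed that switching the method resolves the failure on every setup.

```r
# Hypothetical helper (not part of EricScript): build the Ensembl FASTA URL
# from the same three variables that lib/R/DownloadDB.R already uses.
build_ensembl_fasta_url <- function(ensversion, myrefid, myrefid.path,
                                    base = "ftp://ftp.ensembl.org/pub") {
  # URLs always use "/" separators, so paste0 is used instead of file.path
  paste0(base, "/release-", ensversion, "/fasta/", myrefid, "/dna/", myrefid.path)
}

# Hypothetical wrapper: try R's libcurl backend first, then the curl binary.
# Returns 0 (invisibly) on success, a non-zero status otherwise.
download_genome <- function(url, destfile) {
  status <- tryCatch(
    download.file(url, destfile = destfile, method = "libcurl",
                  mode = "wb", quiet = TRUE),
    error = function(e) 1L  # treat any download error as a failed attempt
  )
  if (!identical(as.integer(status), 0L)) {
    # Fallback: the external curl binary, assumed to be on the PATH
    status <- system2("curl", c("--fail", "--silent", "--show-error",
                                "-o", shQuote(destfile), shQuote(url)))
  }
  invisible(status)
}
```

In DownloadDB.R this would be called roughly as download_genome(build_ensembl_fasta_url(ensversion, myrefid, myrefid.path), file.path(tmpfolder, "seq.fa.gz")). Writing to an explicit destfile avoids the cd and shell redirection in the workaround above.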
Hi,
I'm facing a similar issue. I managed to get ericscript.pl --printdb to work, but it lists only a few organisms for me. I checked the ".ftplist1" file and it appears to be incomplete. Could you please let me know where I can get this file, or whether a pre-built reference is available for GRCh37? (I've checked this link (https://sites.google.com/site/bioericscript/download), but it's not available now.)
ericscript.pl --printdb
Warning message:
In readLines(file.path(ericscriptfolder, "lib", "data", "_resources", :
incomplete final line found on '/media/syncNGS/miniconda3/envs/ericscript/share/ericscript-0.5.5-5/lib/data/_resources/.ftplist1'
Current Ensembl version: 110
Installed Ensembl version: No database installed
Available reference IDs:
acanthochromis_polyacanthus
accipiter_nisus
ailuropoda_melanoleuca
amazona_collaria
amphilophus_citrinellus
amphiprion_ocellaris
amphiprion_percula
anabas_testudineus
anas_platyrhynchos
Hi @smsrts,
Thanks for the updated ericscript.pl
I ran
ericscript.pl --printdb
and it's taking forever. Output:
Then, also tried:
Output
Output:
Thanks.