dib-lab / dammit

just annotate it, dammit!
http://dib-lab.github.io/dammit/

BUSCO v2 database download fails, but gets marked as done #161

Open eburgueno opened 4 years ago

eburgueno commented 4 years ago

Downloading any BUSCO database currently fails because curl is not following HTTP redirects:

$ dammit databases --database-dir /scratch/dammit/databases --install --busco-group eukaryota
# dammit
## a tool for easy de novo transcriptome annotation
by Camille Scott
**v1.2**, 2018
(...)

- [ ] download_and_untar:busco2db-eukaryota: 
    * Cmd: `mkdir -p /scratch/dammit/databases/busco2db; curl https://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz | tar -xz -C /scratch/dammit/databases/busco2db`
    * Cmd: `touch /scratch/dammit/databases/busco2db/download_and_untar:busco2db-eukaryota.done`
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    81  100    81    0     0    205      0 --:--:-- --:--:-- --:--:--   205

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Running the same command with curl -L downloads the database correctly:

$ curl -L https://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz | tar -xz -C /scratch/dammit/databases/busco2db
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    81  100    81    0     0    197      0 --:--:-- --:--:-- --:--:--   197
100 12.6M  100 12.6M    0     0  1606k      0  0:00:08  0:00:08 --:--:-- 2766k

This might belong in a separate issue, but I'll mention it here because it's related: download_and_untar:busco2db-eukaryota.done gets created even though the curl | tar pipeline failed. As a result, dammit treats the database as installed when it is actually missing.
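
For illustration, here is a minimal bash sketch of the behaviour I would expect: follow redirects, and only write the .done marker when the whole pipeline succeeds. The function name, its arguments, and the marker handling are hypothetical and are not taken from dammit's code. It assumes bash (for pipefail) and a curl that supports -f and -L.

```shell
#!/usr/bin/env bash
# Sketch only: download_and_untar and its arguments are hypothetical,
# not dammit's actual implementation.
set -euo pipefail

download_and_untar() {
    local url="$1" dest="$2" marker="$3"
    mkdir -p "$dest"
    # -L follows HTTP redirects; -f makes curl exit non-zero on HTTP errors;
    # -sS hides the progress meter but still prints error messages.
    # With pipefail set, a failure in either curl or tar fails the pipeline.
    if curl -fsSL "$url" | tar -xz -C "$dest"; then
        # Only mark the task as done after a successful download and extraction.
        touch "$marker"
    else
        echo "download or extraction failed: $url" >&2
        return 1
    fi
}

# Example call (paths and URL as in the report above):
# download_and_untar \
#     https://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz \
#     /scratch/dammit/databases/busco2db \
#     /scratch/dammit/databases/busco2db/download_and_untar:busco2db-eukaryota.done
```

The point of the sketch is that the marker is written inside the success branch, so a failed download leaves no .done file behind and the next run can retry.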