gjeunen / reference_database_creator

creating reference databases for amplicon sequencing
MIT License

crabs issue in taxonomy downloading and unzipping #43

Closed Gratutu closed 1 year ago

Gratutu commented 1 year ago

When running

crabs db_download -s taxonomy

the command fails with the following output:

downloading taxonomy information
nucl_gb.accession2t 100%[===================>] 2.19G 1.01MB/s in 38m 8s
unzipping nucl_gb.accession2taxid.gz...

gzip: nucl_gb.accession2taxid.gz: invalid compressed data--format violated
taxdump.tar.gz.1 100%[===================>] 60.84M 837KB/s in 69s
removing intermediary files

Traceback (most recent call last):
  File "crabs", line 1462, in <module>
    main()
  File "crabs", line 1459, in main
    args.func(args)
  File "crabs", line 69, in db_download
    os.remove(file)
FileNotFoundError: [Errno 2] No such file or directory: 'gc.prt'
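
For readers following the traceback: the last frame is `os.remove(file)` raising `FileNotFoundError` because `gc.prt` was not on disk when the cleanup step ran. The snippet below is a minimal sketch of a cleanup step that tolerates missing files. It is not the crabs implementation; `remove_if_present` is a hypothetical helper name, and the file list in the usage lines is only illustrative.

```python
import os


def remove_if_present(paths):
    """Delete each path that exists and return the paths that were missing.

    Hypothetical helper for illustration only; not part of crabs.
    """
    missing = []
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            # The file was never created (e.g. an earlier step failed),
            # so record it instead of aborting the whole cleanup.
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Illustrative file names; the real list of intermediary files may differ.
    skipped = remove_if_present(["gc.prt", "nucl_gb.accession2taxid.gz"])
    if skipped:
        print("not found, skipped:", ", ".join(skipped))
```

Reporting the skipped files, rather than silently ignoring them, keeps the hint that an earlier download or extraction step probably did not complete.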

gjeunen commented 1 year ago

Hello @Gratutu,

It seems that there might have been an issue with downloading the file, as the unzipping step is reporting an error. The NCBI servers can be quite finicky and require a stable internet connection. Could you please rerun the crabs db_download -s taxonomy command? Could you also please let me know which version you're working on? I've just run the code and cannot reproduce the error.
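
One way to confirm that a download is corrupt before rerunning is to test the archive directly. The snippet below is a minimal sketch using only the Python standard library; `gzip_is_intact` is a hypothetical helper (not a crabs function), and the file name in the usage lines is the one from the log above. On the command line, `gzip -t <file>` performs a comparable integrity test.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses cleanly to the end.

    Hypothetical helper for illustration only; not part of crabs.
    The data is read in chunks and discarded, so memory use stays small
    even for multi-gigabyte files.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers missing files, bad headers and CRC mismatches,
        # EOFError covers truncated downloads, zlib.error covers corrupt data.
        return False
    return True


if __name__ == "__main__":
    name = "nucl_gb.accession2taxid.gz"
    if gzip_is_intact(name):
        print(name, "looks intact")
    else:
        print(name, "is missing or corrupt; delete it and download again")
```

A full read of a 2 GB archive takes a while, but it is still much quicker than discovering the corruption after another 38-minute download and a failed unzip.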

Thanks, Gert-Jan

Gratutu commented 1 year ago

Hello Gert-Jan,

Thank you very much for your reply. I solved this issue by changing the cable for my internet connection.

Thank you again!

Best, Gratutu

gjeunen commented 1 year ago

Perfect, thanks!

Gert-Jan