DiltheyLab / MetaMaps

Long-read metagenomic analysis

downloadRefSeq.pl fails #54

Closed khoriba closed 12 months ago

khoriba commented 3 years ago

Hi,

I'd like to build my own database, but downloadRefSeq.pl does not run to completion. I used the following command to download my target species from the NCBI RefSeq database.

downloadRefSeq.pl --targetBranches archaea,bacteria,fungi,protozoa,viral,plasmid --seqencesOutDirectory ./refseq --taxonomyOutDirectory ./taxonomy

The error occurs while processing the bacteria branch. Here is the output, starting just before the error:

Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 893 / 204017 ; species 189 / 53849 bacteria (Streptomyces_sampsonii) -- version 2 / 2: GET GCF_001865315.1_ASM186531v1_assembly_report.txt
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,195,140).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_001865315.1_ASM186531v1_assembly_report.txt
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_001865315.1_ASM186531v1_assembly_report.txt (2057 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Net::FTP=GLOB(0x4875ea0)>>> CWD /genomes/all/GCF/000/623/015/GCF_000623015.1_Acin_baum_LAC-4_V1
Net::FTP=GLOB(0x4875ea0)<<< 250 CWD command successful
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,68).
Net::FTP=GLOB(0x4875ea0)>>> NLST
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for file list
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 894 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 1 / 2: GET GCF_000623015.1_Acin_baum_LAC-4_V1_genomic.fna.gz
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,197,40).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000623015.1_Acin_baum_LAC-4_V1_genomic.fna.gz
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000623015.1_Acin_baum_LAC-4_V1_genomic.fna.gz (1169068 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 894 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 1 / 2: GET GCF_000623015.1_Acin_baum_LAC-4_V1_genomic.gff.gz
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,20).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000623015.1_Acin_baum_LAC-4_V1_genomic.gff.gz
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000623015.1_Acin_baum_LAC-4_V1_genomic.gff.gz (269654 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 894 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 1 / 2: GET GCF_000623015.1_Acin_baum_LAC-4_V1_protein.faa.gz
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,62).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000623015.1_Acin_baum_LAC-4_V1_protein.faa.gz
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000623015.1_Acin_baum_LAC-4_V1_protein.faa.gz (735401 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 894 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 1 / 2: GET GCF_000623015.1_Acin_baum_LAC-4_V1_assembly_report.txt
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,130).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000623015.1_Acin_baum_LAC-4_V1_assembly_report.txt
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000623015.1_Acin_baum_LAC-4_V1_assembly_report.txt (2461 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Net::FTP=GLOB(0x4875ea0)>>> CWD /genomes/all/GCF/000/786/735/GCF_000786735.1_ASM78673v1
Net::FTP=GLOB(0x4875ea0)<<< 250 CWD command successful
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,197,65).
Net::FTP=GLOB(0x4875ea0)>>> NLST
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for file list
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 895 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 2 / 2: GET GCF_000786735.1_ASM78673v1_genomic.fna.gz
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,160).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000786735.1_ASM78673v1_genomic.fna.gz
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000786735.1_ASM78673v1_genomic.fna.gz (1175088 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 895 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 2 / 2: GET GCF_000786735.1_ASM78673v1_genomic.gff.gz
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,180).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000786735.1_ASM78673v1_genomic.gff.gz
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000786735.1_ASM78673v1_genomic.gff.gz (268737 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 895 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 2 / 2: GET GCF_000786735.1_ASM78673v1_protein.faa.gz
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,71).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000786735.1_ASM78673v1_protein.faa.gz
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000786735.1_ASM78673v1_protein.faa.gz (732025 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Genome 895 / 204017 ; species 190 / 53849 bacteria (Acinetobacter_baumannii_LAC_4) -- version 2 / 2: GET GCF_000786735.1_ASM78673v1_assembly_report.txt
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,137).
Net::FTP=GLOB(0x4875ea0)>>> RETR GCF_000786735.1_ASM78673v1_assembly_report.txt
Net::FTP=GLOB(0x4875ea0)<<< 150 Opening BINARY mode data connection for GCF_000786735.1_ASM78673v1_assembly_report.txt (1550 bytes)
Net::FTP=GLOB(0x4875ea0)<<< 226 Transfer complete
Net::FTP=GLOB(0x4875ea0)>>> CWD /genomes/all/GCF/000/623/335/GCF_000623335.1_ASM62333v1
Net::FTP=GLOB(0x4875ea0)<<< 250 CWD command successful
Net::FTP=GLOB(0x4875ea0)>>> PASV
Net::FTP=GLOB(0x4875ea0)<<< 227 Entering Passive Mode (130,14,250,10,196,118).
Net::FTP=GLOB(0x4875ea0)>>> NLST
Net::FTP: Net::Cmd::getline(): unexpected EOF on command channel: at ../../src/MetaMaps/downloadRefSeq.pl line 258.
Net::FTP: Net::Cmd::_is_closed(): unexpected EOF on command channel: at ../../src/MetaMaps/downloadRefSeq.pl line 252.
Net::FTP: Net::Cmd::_is_closed(): unexpected EOF on command channel: at ../../src/MetaMaps/downloadRefSeq.pl line 252.
Cannot change working directory into assembly path /genomes/all/GCF/000/451/685/GCF_000451685.2_ASM45168v2 [Net::FTP] Connection closed at ../../src/MetaMaps/downloadRefSeq.pl line 255.

AlexanderDilthey commented 12 months ago

Hi @khoriba,

I have pushed a more robust RefSeq download script. Please give it a try.

Best

Alex
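
Note for readers hitting the same "unexpected EOF on command channel" error: the log above shows the server dropping the control connection partway through a long session, after which every later command fails. One common way to cope with that is to reconnect and retry the failed step with a growing delay. The sketch below illustrates that idea in Python with the standard `ftplib` module. It is not the code of the updated downloadRefSeq.pl (which is Perl, and whose changes are not shown in this thread). The class name `ResilientFTP`, the helper names, the retry count, the delays, and the host name `ftp.ncbi.nlm.nih.gov` are all assumptions made for illustration.

```python
import ftplib
import time


class ResilientFTP:
    """Hypothetical helper: keeps one FTP session and rebuilds it on failure."""

    def __init__(self, connect, max_attempts=5, base_delay=1.0, sleep=time.sleep):
        self._connect = connect          # zero-argument callable returning a logged-in FTP object
        self._max_attempts = max_attempts
        self._base_delay = base_delay    # seconds; doubled after each failed attempt
        self._sleep = sleep              # injectable so the delay can be skipped in tests
        self._ftp = None

    def _drop(self):
        # Discard a connection that can no longer be trusted.
        if self._ftp is not None:
            try:
                self._ftp.close()
            except Exception:
                pass
            self._ftp = None

    def run(self, action):
        """Call action(ftp); on a dropped or temporarily failing connection,
        reconnect and call it again, up to max_attempts times in total."""
        for attempt in range(1, self._max_attempts + 1):
            try:
                if self._ftp is None:
                    self._ftp = self._connect()
                return action(self._ftp)
            except (EOFError, OSError, ftplib.error_temp):
                # ftplib raises EOFError when the server closes the control channel.
                # Permanent errors (ftplib.error_perm) are not caught and propagate at once.
                self._drop()
                if attempt == self._max_attempts:
                    raise
                self._sleep(self._base_delay * 2 ** (attempt - 1))


def connect_ncbi():
    # Assumption: anonymous login to the NCBI FTP host.
    ftp = ftplib.FTP("ftp.ncbi.nlm.nih.gov", timeout=60)
    ftp.login()
    return ftp


def list_directory(path):
    # Build an action that changes into `path` and lists it (the CWD + NLST pair in the log).
    # The CWD is part of the action so that it is repeated after a reconnect.
    def action(ftp):
        ftp.cwd(path)
        return ftp.nlst()
    return action
```

With these pieces, `ResilientFTP(connect_ncbi).run(list_directory("/genomes/all/GCF/000/623/335/GCF_000623335.1_ASM62333v1"))` would retry the listing that failed in the log instead of aborting the whole run. Keeping the directory change inside the retried action matters because a fresh connection starts in the server's default directory.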