DiltheyLab / MetaMaps

Long-read metagenomic analysis

Unable to buildDB #49

Open Electrocyte opened 3 years ago

Electrocyte commented 3 years ago

I am unable to run buildDB.pl. Initially I suspected this was because the .fna files were gzipped, but after decompressing them a new issue arose.

I used the following commands to download a smaller version of the NCBI RefSeq database, as the recommended command ran for over a week without completing:

perl downloadRefSeq.pl --DB refseq --seqencesOutDirectory download/refseq --taxonomyOutDirectory download/taxonomy --targetBranches bacteria --skipIncompleteGenomes 1
perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniqueIDs
gunzip -r download/refseq/bacteria/
perl buildDB.pl --DB databases/bacteria --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs

And I got the following error message after adding the gunzip line:

~/MetaMaps$ perl buildDB.pl --DB databases/bacteria --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs

Number of found FASTA input files: 2968

Reading taxonomy from download/taxonomy_uniqueIDs ..
        done.

Expect taxon ID in contig identifier - file download/refseq/bacteria/Nitratireductor_sp__SY7/GCF_007922615.2_ASM792261v2/GCF_007922615.2_ASM792261v2_genomic.fna - line 1 at /home/x/MetaMaps/perlLib/Util.pm line 53, <F> line 1.

As a side note, without the gunzip command step, I get:

Number of found FASTA input files: 0

And it fails differently.
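
To see which state the download directory is in, it can help to count compressed and uncompressed FASTA files side by side. The following is a minimal shell sketch, not part of MetaMaps; the function name `count_fasta` is made up for illustration, and it assumes (as the two outputs above suggest, though nothing here confirms it) that buildDB.pl only counts uncompressed `.fna` files.

```shell
# Hypothetical helper, not part of MetaMaps: count uncompressed and
# gzip-compressed FASTA files below a directory.
count_fasta() {
    # tr strips the padding that some wc implementations add
    plain_count=$(find "$1" -type f -name '*.fna' | wc -l | tr -d ' ')
    gz_count=$(find "$1" -type f -name '*.fna.gz' | wc -l | tr -d ' ')
    echo "fna=$plain_count fna.gz=$gz_count"
}

# Example, using the path from the commands above:
# count_fasta download/refseq
```

If the first number is 0 while the second is large, the files are still in the state the download step left them in.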

tim488 commented 3 years ago

Same here. Would be nice if somebody could answer this, I think this is a rather serious issue. @AlexanderDilthey, @froggleston, @cjain7 any idea?

AlexanderDilthey commented 3 years ago

Sorry for being slow to respond - my group is swamped with COVID-related projects at the moment. I'll look into this and try to fix ASAP!

froggleston commented 3 years ago

Hi @Electrocyte - could you paste the first line of download/refseq/bacteria/Nitratireductor_sp__SY7/GCF_007922615.2_ASM792261v2/GCF_007922615.2_ASM792261v2_genomic.fna please?

And could you attach the full output of each command as it's run? It sounds like the download and building isn't working as it should, hence you having to unzip manually...

tim488 commented 3 years ago

After manually unzipping I get:

perl buildDB.pl --DB databases/bacteria --FASTAs download/refseq/bacteria/ --taxonomy download/taxonomy_uniqueIDs

Number of found FASTA input files: 19038

Reading taxonomy from download/taxonomy_uniqueIDs ..
    done.

Expect taxon ID in contig identifier - file download/refseq/bacteria/Bacillus_cereus/GCF_000835185.1_ASM83518v1/GCF_000835185.1_ASM83518v1_genomic.fna - line 1 at /sybig/home/cah/MetaMaps/perlLib/Util.pm line 53, <F> line 1.

The respective file starts with:

>NZ_CP009605.1 Bacillus cereus strain S2-8 chromosome, complete genome

I had a look at the code, and it fails while checking for /kraken:taxid\|(x?\d+)/, which obviously isn't there... I then gave up because I didn't want to spend so much time on it^^
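
The same header check can be reproduced from the shell before running buildDB.pl. This is a minimal sketch rather than MetaMaps code: the function name `has_taxid_header` is hypothetical, and the pattern is a `grep -E` translation of the Perl regular expression quoted above.

```shell
# Hypothetical helper, not part of MetaMaps: succeed if the first line of a
# FASTA file carries a kraken:taxid|<id> tag (optionally with an "x" prefix),
# mirroring the Perl pattern /kraken:taxid\|(x?\d+)/ quoted in this thread.
has_taxid_header() {
    head -n 1 "$1" | grep -Eq 'kraken:taxid\|x?[0-9]+'
}

# Example: list FASTA files whose first header lacks the tag.
# find download/refseq -name '*.fna' | while read -r f; do
#     has_taxid_header "$f" || echo "untagged: $f"
# done
```

A file that fails this check would presumably trigger the same "Expect taxon ID in contig identifier" error.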

I will run the other commands and post the output here, it may take a while though as I am not in the office...

Electrocyte commented 3 years ago

I am missing the Bacillus cereus genome:

Bacillus_cereus_AH820/ Bacillus_cereus_C1L/
fourier:~/MetaMaps/download$ cd refseq/bacteria/Bacilluscereus

However, the first lines for Nitratireductor are:

fourier:~/MetaMaps/download/refseq/bacteria/Nitratireductor_sp__SY7/GCF_007922615.2_ASM792261v2$ head GCF_007922615.2_ASM792261v2_genomic.fna
>NZ_CP042301.2 Nitratireductor sp. SY7 chromosome, complete genome
CCCTCCTGTCCCGAGAGGTCATGGTTGCATTCGAGAGCGCCTGCGCGGAGCGCGCCGAGCGGGGCGAAAGCGTGGAGCTT
ACCCTGATTGGAATTGATACAGTCGATGTGGTGCGTGCGAACGTAAATTCCGAGTCGATGGAAGTGACGCTCCGGTTTCA
CACGCAAATGGTTTGGGTGGAGCGCAACGCTGAAGGGACGATTGTCGGCGGTGATCCTGCGGAGGTGGCCGATATGGTGG
ACACATGGACATTTGCCCGACCTGTTCCGGTCTCGAGCAATACGTGGGCCGTCGTCGCGACCGGCCAATAACTGCACCTA
TTGCCCGCCGCTTTGTCGCGCCAAGAAACATTGTGCGGTCTGGACTCGGCGCGAGCTCCTTGAGTCATGGATCAATCGCC
GCAGTTAGTTCTGGGAGGCCGGGGGGGCTGCCGACAAGCGCCCGCAAGCCGTCGTCGTTCGAGGGTTCTTGATAAACCGG
GCTTGAGACGGACCTTCATCGCCATCTCGGTTTGTTGCCGATCCCGGCATTCTTTCATGTAAGCGTCGAACAGATCTCAA
CTGGCGTTAGGGCGCTTCATGAGCGGAAGCCTCTCGCCTGGGTTGATTTCGCCAGCGTCGCGACAACGCCTGGCTCAGCG
AGCGTGGACATGTCGCCCAAATCATTCGTCTGCCCTTCGGCAAGCTTACGCAACACGCGTCGCATAATCTTGCCGGAACG
fourier:~/MetaMaps/download/refseq/bacteria/Nitratireductor_sp__SY7/GCF_007922615.2_ASM792261v2$

AlexanderDilthey commented 3 years ago

Did you run the annotateRefSeqSequencesWithUniqueTaxonIDs.pl script (see the second step under "Databases" on the repo's main site)? This should add identifiers of the kraken:taxid kind...

Electrocyte commented 3 years ago

Checking the output dir:

@fourier:~/MetaMaps$ ls download/taxonomy_uniqueIDs -alh
total 346M
drwxrwxr-x 2 4.0K Oct 12 19:00 .
drwxrwxr-x 5 4.0K Oct 13 18:29 ..
-rw-r--r-- 1 4.0M Oct 13 17:59 delnodes.dmp
-rw-r--r-- 1 1.1M Oct 13 17:59 merged.dmp
-rw-r--r-- 1 191M Oct 13 17:59 names.dmp
-rw-r--r-- 1 150M Oct 13 17:59 nodes.dmp

These are the commands I used:

perl downloadRefSeq.pl --seqencesOutDirectory download/refseq --taxonomyOutDirectory download/taxonomy --targetBranches bacteria
perl downloadRefSeq.pl --DB refseq --seqencesOutDirectory download/refseq --taxonomyOutDirectory download/taxonomy --targetBranches bacteria --skipIncompleteGenomes 1
perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniqueIDs
gunzip -r download/refseq/bacteria/
perl buildDB.pl --DB databases/bacteria --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs

The first downloadRefSeq.pl run was for the whole database; since it was still running after a week, I killed it. The second was to get an abridged RefSeq, which finished in under 12 hours. Let me know if I can help with anything else.

tim488 commented 3 years ago

Hi, me again, sorry for the delay... I recovered the screen from my original run.

After downloading the data (yes, it ran a long time and often had to be resumed), I ran the following in the repo's main directory:

perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniqueIDs
Reading taxonomy from download/taxonomy ..  
        done.

Scanning /sybig/home/cah/MetaMaps/download/refseq for *_assembly_report.txt
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.

This line gets repeated many times, and then:

Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.
Taxon ID 2582918 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2583232 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661838 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661839 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 1660064 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2587161 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Summary input data:
        Refseq categories:
                Reference Genome: 16
                : 16403
                Representative Genome: 3005
        Assembly levels:
                Scaffold: 226
                Complete Genome: 19103
                Contig: 47
                Chromosome: 48
        {Category} X {Assembly level}:
                Reference Genome_Complete Genome: 16
                Representative Genome_Contig: 47
                Representative Genome_Chromosome: 45
                Representative Genome_Scaffold: 222
                Representative Genome_Complete Genome: 2691
                Undefined_Scaffold: 4
                Undefined_Chromosome: 3
                Undefined_Complete Genome: 16396
Total genomes: 19103
$VAR1 = 'Unexpected rank';
$VAR2 = 'strain';
$VAR3 = '580047';
$VAR4 = [
          '/sybig/home/cah/MetaMaps/download/refseq/bacteria/Chlamydia_trachomatis_A2497/GCF_000284475.1_ASM28447v1/GCF_000284475.1_ASM28447v1_assembly_report.txt',
          '/sybig/home/cah/MetaMaps/download/refseq/bacteria/Chlamydia_trachomatis_A2497/GCF_000226605.1_ASM22660v1/GCF_000226605.1_ASM22660v1_assembly_report.txt'
        ];

And, as Electrocyte pointed out, when running buildDB.pl I get:

perl buildDB.pl --DB databases/myDB --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs
Number of found FASTA input files: 0

Reading taxonomy from download/taxonomy_uniqueIDs ..
        done.

Sadly I wasn't able to recover the output of the download routine, because the screen backlog is not long enough... Please let me know if I can be of any other help :angel:

AlexanderDilthey commented 3 years ago

OK, I think we're dealing with separate problems here.

@tim488, could you do a fresh pull from GitHub and try the annotateRefSeqSequencesWithUniqueTaxonIDs.pl command again? I have pushed a fix that may (hopefully should) have fixed the problem. We may have to iterate a bit in case further taxonomic levels start appearing in the error message.

@Electrocyte, I think something else is going on in your case. Manually unzipping the downloaded files should not be necessary. Could you confirm the following commands work in your environment? They should finish within 10 minutes or so.


perl downloadRefSeq.pl --seqencesOutDirectory download/refseq --taxonomyOutDirectory download/taxonomy --targetBranches fungi --skipIncompleteGenomes 1
perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniqueIDs
perl buildDB.pl --DB databases/myDB --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs

Electrocyte commented 3 years ago

Do you think it would be worth removing the refseq database I have downloaded for future steps?

After running the 3 commands, I get the following:

fourier:~/MetaMaps$ perl downloadRefSeq.pl --seqencesOutDirectory download/refseq --taxonomyOutDirectory download/taxonomy --targetBranches fungi --skipIncompleteGenomes 1

citations.dmp
delnodes.dmp
division.dmp
gencode.dmp
merged.dmp
names.dmp
nodes.dmp
gc.prt
readme.txt

Taxonomy downloaded and extracted into download/taxonomy

Processing 357 entriesNow download genomes for 14 fungi species (14 genomes - refseq - skip incomplete genomes: 1).
         Genome 12 / 14 ; species 12 / 14 fungi (Talaromyces_rugulosus) -- version 1 / 1: GET GCF_013368755.1_ASM1336875v1_assembly_report.txt                        Cannot transfer file Idle timeout (60 seconds): closing control connection
Cannot transfer file GCF_013368755.1_ASM1336875v1_assembly_report.txt: No such file or directory
Cannot transfer file GCF_013368755.1_ASM1336875v1_assembly_report.txt: No such file or directory
Net::FTP>>> Net::FTP(3.10)
Net::FTP>>>   Exporter(5.72)
Net::FTP>>>   Net::Cmd(3.10)
Net::FTP>>>   IO::Socket::SSL(2.060)
Net::FTP>>>     IO::Socket::IP(0.38)
Net::FTP>>>       IO::Socket(1.38)
Net::FTP>>>         IO::Handle(1.36)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 220-
Net::FTP=GLOB(0x557ac44fc4e0)<<<  This warning banner provides privacy and security notices consistent with
Net::FTP=GLOB(0x557ac44fc4e0)<<<  applicable federal laws, directives, and other federal guidance for accessing
Net::FTP=GLOB(0x557ac44fc4e0)<<<  this Government system, which includes all devices/storage media attached to
Net::FTP=GLOB(0x557ac44fc4e0)<<<  this system. This system is provided for Government-authorized use only.
Net::FTP=GLOB(0x557ac44fc4e0)<<<  Unauthorized or improper use of this system is prohibited and may result in
Net::FTP=GLOB(0x557ac44fc4e0)<<<  disciplinary action and/or civil and criminal penalties. At any time, and for
Net::FTP=GLOB(0x557ac44fc4e0)<<<  any lawful Government purpose, the government may monitor, record, and audit
Net::FTP=GLOB(0x557ac44fc4e0)<<<  your system usage and/or intercept, search and seize any communication or data
Net::FTP=GLOB(0x557ac44fc4e0)<<<  transiting or stored on this system. Therefore, you have no reasonable
Net::FTP=GLOB(0x557ac44fc4e0)<<<  expectation of privacy. Any communication or data transiting or stored on this
Net::FTP=GLOB(0x557ac44fc4e0)<<<  system may be disclosed or used for any lawful Government purpose.
Net::FTP=GLOB(0x557ac44fc4e0)<<< 220 FTP Server ready.
Net::FTP=GLOB(0x557ac44fc4e0)>>> USER anonymous
Net::FTP=GLOB(0x557ac44fc4e0)<<< 331 Anonymous login ok, send your complete email address as your password
Net::FTP=GLOB(0x557ac44fc4e0)>>> PASS ....
Net::FTP=GLOB(0x557ac44fc4e0)<<< 230 Anonymous access granted, restrictions apply
Net::FTP=GLOB(0x557ac44fc4e0)>>> TYPE I
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 Type set to I
Net::FTP=GLOB(0x557ac44fc4e0)>>> CWD /genomes/all/GCF/001/417/885/GCF_001417885.1_Kmar_1.0
Net::FTP=GLOB(0x557ac44fc4e0)<<< 250 CWD command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,181,9
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> NLST
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for file list
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 13 / 14 ; species 13 / 14 fungi (Kluyveromyces_marxianus_DMKU3_1042) -- version 1 / 1: GET GCF_001417885.1_Kmar_1.0_genomic.fna.gz                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,160,39
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_001417885.1_Kmar_1.0_genomic.fna.gz
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_001417885.1_Kmar_1.0_genomic.fna.gz (3441865 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 13 / 14 ; species 13 / 14 fungi (Kluyveromyces_marxianus_DMKU3_1042) -- version 1 / 1: GET GCF_001417885.1_Kmar_1.0_genomic.gff.gz                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,222,105
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_001417885.1_Kmar_1.0_genomic.gff.gz
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_001417885.1_Kmar_1.0_genomic.gff.gz (646157 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 13 / 14 ; species 13 / 14 fungi (Kluyveromyces_marxianus_DMKU3_1042) -- version 1 / 1: GET GCF_001417885.1_Kmar_1.0_protein.faa.gz                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,196,155
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_001417885.1_Kmar_1.0_protein.faa.gz
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_001417885.1_Kmar_1.0_protein.faa.gz (1604293 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 13 / 14 ; species 13 / 14 fungi (Kluyveromyces_marxianus_DMKU3_1042) -- version 1 / 1: GET GCF_001417885.1_Kmar_1.0_assembly_report.txt                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,147,131
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_001417885.1_Kmar_1.0_assembly_report.txt
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_001417885.1_Kmar_1.0_assembly_report.txt (2066 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
Net::FTP=GLOB(0x557ac44fc4e0)>>> CWD /genomes/all/GCF/000/226/115/GCF_000226115.1_ASM22611v1
Net::FTP=GLOB(0x557ac44fc4e0)<<< 250 CWD command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,232,17
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> NLST
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for file list
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 14 / 14 ; species 14 / 14 fungi (Thermothielavioides_terrestris_NRRL_8126) -- version 1 / 1: GET GCF_000226115.1_ASM22611v1_genomic.fna.gz                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,213,105
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_000226115.1_ASM22611v1_genomic.fna.gz
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_000226115.1_ASM22611v1_genomic.fna.gz (11663499 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 14 / 14 ; species 14 / 14 fungi (Thermothielavioides_terrestris_NRRL_8126) -- version 1 / 1: GET GCF_000226115.1_ASM22611v1_genomic.gff.gz                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,230,89
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_000226115.1_ASM22611v1_genomic.gff.gz
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_000226115.1_ASM22611v1_genomic.gff.gz (1644786 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 14 / 14 ; species 14 / 14 fungi (Thermothielavioides_terrestris_NRRL_8126) -- version 1 / 1: GET GCF_000226115.1_ASM22611v1_protein.faa.gz                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,214,159
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_000226115.1_ASM22611v1_protein.faa.gz
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_000226115.1_ASM22611v1_protein.faa.gz (2841098 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete
         Genome 14 / 14 ; species 14 / 14 fungi (Thermothielavioides_terrestris_NRRL_8126) -- version 1 / 1: GET GCF_000226115.1_ASM22611v1_assembly_report.txt                        Net::FTP=GLOB(0x557ac44fc4e0)>>> PORT 137,132,22,78,154,157
Net::FTP=GLOB(0x557ac44fc4e0)<<< 200 PORT command successful
Net::FTP=GLOB(0x557ac44fc4e0)>>> RETR GCF_000226115.1_ASM22611v1_assembly_report.txt
Net::FTP=GLOB(0x557ac44fc4e0)<<< 150 Opening BINARY mode data connection for GCF_000226115.1_ASM22611v1_assembly_report.txt (1732 bytes)
Net::FTP=GLOB(0x557ac44fc4e0)<<< 226 Transfer complete

Summary for fungi:
        Downloaded species: 14
        Downloaded assemblies: 14

Download for refseq complete. Have 14 assemblies.

Download successful - output directories:
- (sequences)  download/refseq
- (taxonomy)   download/taxonomy

Suggested command for next step:

perl /home/x/MetaMaps/annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory DIR

fourier:~/MetaMaps$ perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniqueIDs
Reading taxonomy from download/taxonomy ..
        done.

Scanning /home/x/MetaMaps/download/refseq for *_assembly_report.txt
No assembly data file - /home/x/MetaMaps/download/refseq/bacteria/Nitratireductor_sp__SY7/GCF_007922615.2_ASM792261v2/GCF_007922615.2_ASM792261v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /home/x/MetaMaps/download/refseq/bacteria/Paenibacillus_polymyxa_CR1/GCF_000507205.3_ASM50720v2/GCF_000507205.3_ASM50720v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /home/x/MetaMaps/download/refseq/bacteria/Enterococcus_faecium_DO/GCF_000174395.2_ASM17439v2/GCF_000174395.2_ASM17439v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /home/x/MetaMaps/download/refseq/bacteria/Helicobacter_cinaedi/GCF_003213725.2_ASM321372v2/GCF_003213725.2_ASM321372v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /home/x/MetaMaps/download/refseq/bacteria/Helicobacter_cinaedi/GCF_902381705.1_UHGG_MGYG-HGUT-01432/GCF_902381705.1_UHGG_MGYG-HGUT-01432_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.

The error continues for a while due to lack of assembly data file.
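
The paths in these warnings all end in `_genomic.fna.gz`, and they point into the bacteria directory that was manually gunzipped earlier. One reading, which is an inference from the log and not confirmed anywhere in this thread, is that the annotation script expects a compressed `_genomic.fna.gz` next to each `_assembly_report.txt`. Under that assumption, the affected assemblies can be listed with a small shell sketch; `missing_fna_gz` is a made-up name.

```shell
# Hypothetical helper, not part of MetaMaps: for every *_assembly_report.txt
# below a directory, print the expected sibling *_genomic.fna.gz if that
# file does not exist.
missing_fna_gz() {
    find "$1" -type f -name '*_assembly_report.txt' | sort | while read -r report; do
        expected="${report%_assembly_report.txt}_genomic.fna.gz"
        [ -f "$expected" ] || echo "$expected"
    done
}

# Example:
# missing_fna_gz download/refseq | head
```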

Summary input data:
        Refseq categories:
                Representative Genome: 12
                Reference Genome: 1
        Assembly levels:
                Complete Genome: 13
        {Category} X {Assembly level}:
                Reference Genome_Complete Genome: 1
                Representative Genome_Complete Genome: 12
Total genomes: 13
Introduced 0 new taxonomic IDs
Annotated 116 contigs

Output new taxonomy into download/taxonomy_uniqueIDs

fourier:~/MetaMaps$ perl buildDB.pl --DB databases/myDB --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs

Number of found FASTA input files: 2981

Reading taxonomy from download/taxonomy_uniqueIDs ..
        done.

Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Running count keys in gene-2-protein: 0
Expect taxon ID in contig identifier - file download/refseq/bacteria/Nitratireductor_sp__SY7/GCF_007922615.2_ASM792261v2/GCF_007922615.2_ASM792261v2_genomic.fna - line 1 at /home/x/MetaMaps/perlLib/Util.pm line 53, <F> line 1.

Seems the errors propagate from:

Cannot transfer file GCF_013368755.1_ASM1336875v1_assembly_report.txt: No such file or directory
Cannot transfer file GCF_013368755.1_ASM1336875v1_assembly_report.txt: No such file or directory

AlexanderDilthey commented 3 years ago

Do you think it would be worth removing the refseq database I have downloaded for future steps?

At least for testing purposes, I think it would be good to use a fresh directory for the download and all subsequent steps!
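
The fungi test run above shows why this matters: its annotation step still reported files under download/refseq/bacteria, left over from the earlier download. A quick look at what is already in the sequence directory can catch this before re-running. The sketch below is illustrative only; `list_branches` is a made-up name, and `-mindepth`/`-maxdepth` assume GNU or BSD find.

```shell
# Hypothetical helper, not part of MetaMaps: print the immediate
# subdirectories of a download directory, one per line, sorted.
list_branches() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | sed 's|.*/||' | sort
}

# Example:
# list_branches download/refseq
```

After a fungi-only download into a fresh directory, presumably only a fungi subdirectory should show up.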

Electrocyte commented 3 years ago

OK, after removing previous database and download folders:

james@fourier:~/MetaMaps$ perl downloadRefSeq.pl --seqencesOutDirectory download/refseq --taxonomyOutDirectory download/taxonomy --targetBranches fungi --skipIncompleteGenomes 1

Unable to close datastream at downloadRefSeq.pl line 77.
Cannot transfer file[Net::FTP] Connection closed at downloadRefSeq.pl line 77.
james@fourier:~/MetaMaps$
james@fourier:~/MetaMaps$ perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniqueIDs
Reading taxonomy from download/taxonomy ..
File download/taxonomy/names.dmp missing, download/taxonomy is not a valid taxonomy directory. at /home/james/MetaMaps/perlLib/taxTree.pm line 20.
james@fourier:~/MetaMaps$
james@fourier:~/MetaMaps$ perl buildDB.pl --DB databases/myDB --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs
Please specify a taxonomy directory (parameter --taxonomy) at buildDB.pl line 43.
james@fourier:~/MetaMaps$

AlexanderDilthey commented 3 years ago

Hi @Electrocyte, I am not sure this is a problem with MetaMaps per se, as the exact same commands execute fine e.g. on my local machine.

Is it possible that your firewall or network settings interfere? Can you manually download from ftp.ncbi.nlm.nih.gov?
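
One detail in the verbose Net::FTP log earlier in this thread may be relevant: the client sends PORT commands, which in FTP means active mode, where the server opens the data connection back to the client. Firewalls and NAT commonly block that direction, whereas passive mode usually gets through. Whether that is the cause here is not established in this thread. The sketch below only decodes a PORT argument into the address and port the server was asked to connect to; `decode_ftp_port` is a made-up name.

```shell
# Hypothetical helper: decode the argument of an FTP PORT command
# (h1,h2,h3,h4,p1,p2) into host:port, where port = p1 * 256 + p2.
decode_ftp_port() {
    printf '%s\n' "$1" |
        awk -F, '{ printf "%s.%s.%s.%s:%d\n", $1, $2, $3, $4, $5 * 256 + $6 }'
}

# From the log above: PORT 137,132,22,78,181,9
# decode_ftp_port "137,132,22,78,181,9"   # prints 137.132.22.78:46345

# A manual passive-mode check with curl (needs network access):
# curl --ftp-pasv --list-only ftp://ftp.ncbi.nlm.nih.gov/genomes/
```

If the curl listing works while the Perl download does not, passive mode is worth trying. Net::FTP documents a `Passive` option and an `FTP_PASSIVE` environment variable, but whether downloadRefSeq.pl honours them is an assumption to verify.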

tim488 commented 3 years ago

Hi, we installed the new version, but still had no success. However, we got new output^^ Here is my terminal:

$ perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniq
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_ADDRESS = "de_DE.UTF-8",
    LC_NAME = "de_DE.UTF-8",
    LC_MONETARY = "de_DE.UTF-8",
    LC_PAPER = "de_DE.UTF-8",
    LC_IDENTIFICATION = "de_DE.UTF-8",
    LC_TELEPHONE = "de_DE.UTF-8",
    LC_MEASUREMENT = "de_DE.UTF-8",
    LC_TIME = "de_DE.UTF-8",
    LC_NUMERIC = "en_US.UTF-8",
    LANG = "de_DE.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("de_DE.UTF-8").
Reading taxonomy from download/taxonomy ..
    done.

Scanning /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq for *_assembly_report.txt
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.

many times repeated

Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Terriglobus_saanensis_SP1PR4/GCF_000179915.2_ASM17991v2/GCF_000179915.2_ASM17991v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Spirochaeta_thermophila_DSM_6192/GCF_000147075.1_ASM14707v1/GCF_000147075.1_ASM14707v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Rhodoferax_ferrireducens_T118/GCF_000013605.1_ASM1360v1/GCF_000013605.1_ASM1360v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Brachybacterium_saurashtrense/GCF_003355475.1_ASM335547v1/GCF_003355475.1_ASM335547v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Corynebacterium_halotolerans_YIM_70093___DSM_44683/GCF_000341345.1_ASM34134v1/GCF_000341345.1_ASM34134v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Mycoplasma_hyorhinis_SK76/GCF_000313635.1_ASM31363v1/GCF_000313635.1_ASM31363v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Enterobacter_sp__RHBSTW_00994/GCF_013782625.1_ASM1378262v1/GCF_013782625.1_ASM1378262v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Amycolatopsis_japonica/GCF_000732925.1_ASM73292v1/GCF_000732925.1_ASM73292v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Escherichia_coli_O114_H49/GCF_002741255.1_ASM274125v1/GCF_002741255.1_ASM274125v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Pantoea_sp__MT58/GCF_014495885.1_ASM1449588v1/GCF_014495885.1_ASM1449588v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Magnetospira_sp__QH_2/GCF_000968135.1_ASM96813v1/GCF_000968135.1_ASM96813v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Devosia_sp__MC521/GCF_014127105.1_ASM1412710v1/GCF_014127105.1_ASM1412710v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Lactobacillus_ruminis_ATCC_27782/GCF_000224985.1_ASM22498v1/GCF_000224985.1_ASM22498v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Corynebacterium_pseudotuberculosis_I19/GCF_000152065.3_ASM15206v3/GCF_000152065.3_ASM15206v3_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Salmonella_enterica_subsp__enterica_serovar_Typhimurium_str__SARA13/GCF_000486345.2_ASM48634v2/GCF_000486345.2_ASM48634v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Arcobacter_lekithochrous/GCF_013283835.1_ASM1328383v1/GCF_013283835.1_ASM1328383v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Cupriavidus_necator_H16/GCF_000009285.1_ASM928v2/GCF_000009285.1_ASM928v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Cupriavidus_necator_H16/GCF_004798725.1_ASM479872v1/GCF_004798725.1_ASM479872v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Thermus_sp__CCB_US3_UF1/GCF_000236585.1_ASM23658v1/GCF_000236585.1_ASM23658v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Salmonella_enterica_subsp__enterica_serovar_Typhimurium_str__L_3553/GCF_000828595.1_ASM82859v1/GCF_000828595.1_ASM82859v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Paenibacillus_sp__FSL_R7_0273/GCF_000758625.1_ASM75862v1/GCF_000758625.1_ASM75862v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000832525.1_ASM83252v1/GCF_000832525.1_ASM83252v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013394245.1_ASM1339424v1/GCF_013394245.1_ASM1339424v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001941905.1_ASM194190v1/GCF_001941905.1_ASM194190v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002214765.1_ASM221476v1/GCF_002214765.1_ASM221476v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267775.1_ASM1326777v1/GCF_013267775.1_ASM1326777v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267255.1_ASM1326725v1/GCF_013267255.1_ASM1326725v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000978375.1_ASM97837v1/GCF_000978375.1_ASM97837v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001277915.1_ASM127791v1/GCF_001277915.1_ASM127791v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001635915.1_ASM163591v1/GCF_001635915.1_ASM163591v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_004771155.1_ASM477115v1/GCF_004771155.1_ASM477115v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002214705.1_ASM221470v1/GCF_002214705.1_ASM221470v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000635895.2_ASM63589v2/GCF_000635895.2_ASM63589v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000789315.1_ASM78931v1/GCF_000789315.1_ASM78931v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002813875.1_ASM281387v1/GCF_002813875.1_ASM281387v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001941885.1_ASM194188v1/GCF_001941885.1_ASM194188v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_006384875.1_ASM638487v1/GCF_006384875.1_ASM638487v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013177495.1_ASM1317749v1/GCF_013177495.1_ASM1317749v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_005707595.1_ASM570759v1/GCF_005707595.1_ASM570759v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_003013315.1_ASM301331v1/GCF_003013315.1_ASM301331v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_009739985.1_ASM973998v1/GCF_009739985.1_ASM973998v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002290105.1_ASM229010v1/GCF_002290105.1_ASM229010v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002215175.1_ASM221517v1/GCF_002215175.1_ASM221517v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267275.1_ASM1326727v1/GCF_013267275.1_ASM1326727v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001880305.1_ASM188030v1/GCF_001880305.1_ASM188030v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002216125.1_ASM221612v1/GCF_002216125.1_ASM221612v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_003568565.1_ASM356856v1/GCF_003568565.1_ASM356856v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002000005.1_ASM200000v1/GCF_002000005.1_ASM200000v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267475.1_ASM1326747v1/GCF_013267475.1_ASM1326747v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_008041975.1_ASM804197v1/GCF_008041975.1_ASM804197v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002220285.1_ASM222028v1/GCF_002220285.1_ASM222028v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_009739965.1_ASM973996v1/GCF_009739965.1_ASM973996v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_004006495.1_ASM400649v1/GCF_004006495.1_ASM400649v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267455.1_ASM1326745v1/GCF_013267455.1_ASM1326745v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013112375.1_ASM1311237v1/GCF_013112375.1_ASM1311237v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
Taxon ID 2582918 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2583232 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661838 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661839 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 1660064 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2587161 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Summary input data:
    Refseq categories:
        : 16358
        Reference Genome: 16
        Representative Genome: 2995
    Assembly levels:
        Scaffold: 226
        Contig: 47
        Chromosome: 48
        Complete Genome: 19048
    {Category} X {Assembly level}:
        Representative Genome_Scaffold: 222
        Reference Genome_Complete Genome: 16
        Representative Genome_Contig: 47
        Undefined_Scaffold: 4
        Representative Genome_Complete Genome: 2681
        Undefined_Chromosome: 3
        Undefined_Complete Genome: 16351
        Representative Genome_Chromosome: 45
Total genomes: 19048 
$VAR1 = 'Unexpected rank';
$VAR2 = 'isolate';
$VAR3 = '1206109';
$VAR4 = [
          '/sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Candidatus_Portiera_aleyrodidarum_BT_B_HRs/GCF_000292685.1_ASM29268v1/GCF_000292685.1_ASM29268v1_assembly_report.txt',
          '/sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Candidatus_Portiera_aleyrodidarum_BT_B_HRs/GCF_000300075.1_ASM30007v1/GCF_000300075.1_ASM30007v1_assembly_report.txt'
        ];

(base) tvb@deepthought:/sybig/home/projects/AG_Neesse/programme/MetaMaps_2$ perl buildDB.pl --DB databases/myDB --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_ADDRESS = "de_DE.UTF-8",
    LC_NAME = "de_DE.UTF-8",
    LC_MONETARY = "de_DE.UTF-8",
    LC_PAPER = "de_DE.UTF-8",
    LC_IDENTIFICATION = "de_DE.UTF-8",
    LC_TELEPHONE = "de_DE.UTF-8",
    LC_MEASUREMENT = "de_DE.UTF-8",
    LC_TIME = "de_DE.UTF-8",
    LC_NUMERIC = "en_US.UTF-8",
    LANG = "de_DE.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("de_DE.UTF-8").

Number of found FASTA input files: 0

Reading taxonomy from download/taxonomy_uniqueIDs ..
    done.

$VAR1 = [
          'Annotation files',
          0
        ];
$VAR2 = [
          'Protein files',
          0
        ];
Died at /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/perlLib/taxTree.pm line 315.
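
The two runs above fail in different ways: the annotation step reports missing `_genomic.fna.gz` files, while `buildDB.pl` reports 0 FASTA input files. One way to see what is actually on disk is to count plain versus gzipped genomic FASTA files under the `--FASTAs` directory. The sketch below is a hypothetical helper and not part of MetaMaps. The function name `count_genomic_fastas` is made up for illustration, the file suffixes are taken from the paths in the log, and which state each script expects is an assumption that should be checked against the scripts themselves.

```python
from collections import Counter
from pathlib import Path


def count_genomic_fastas(root):
    """Count plain and gzipped *_genomic.fna files below root.

    Hypothetical diagnostic helper, not part of MetaMaps.
    """
    counts = Counter()
    for path in Path(root).rglob("*_genomic.fna*"):
        name = path.name
        # NCBI assembly folders can also hold *_cds_from_genomic and
        # *_rna_from_genomic files; those are not whole-genome FASTAs.
        if "_from_genomic" in name:
            continue
        if name.endswith("_genomic.fna.gz"):
            counts["gzipped"] += 1
        elif name.endswith("_genomic.fna"):
            counts["plain"] += 1
    return counts
```

Calling `count_genomic_fastas("download/refseq")` from the MetaMaps directory would show whether the files are currently gzipped, unpacked, or missing altogether.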

Electrocyte commented 3 years ago

Hi, below are two separate attempts, both worked fine, if somewhat slowly.

[image: image.png] Thank you!

On Mon, 7 Dec 2020 at 00:38, tim488 notifications@github.com wrote:

Hi, we installed the new version, but still had no success. However, we got new output ^^ Here is my terminal:

$ perl annotateRefSeqSequencesWithUniqueTaxonIDs.pl --refSeqDirectory download/refseq --taxonomyInDirectory download/taxonomy --taxonomyOutDirectory download/taxonomy_uniq
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_ADDRESS = "de_DE.UTF-8",
    LC_NAME = "de_DE.UTF-8",
    LC_MONETARY = "de_DE.UTF-8",
    LC_PAPER = "de_DE.UTF-8",
    LC_IDENTIFICATION = "de_DE.UTF-8",
    LC_TELEPHONE = "de_DE.UTF-8",
    LC_MEASUREMENT = "de_DE.UTF-8",
    LC_TIME = "de_DE.UTF-8",
    LC_NUMERIC = "en_US.UTF-8",
    LANG = "de_DE.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("de_DE.UTF-8").
Reading taxonomy from download/taxonomy ..
    done.

Scanning /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq for *_assembly_report.txt
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125.
[the same warning is repeated many more times]
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. 
Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. Exiting subroutine via next at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 125. 
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Terriglobus_saanensis_SP1PR4/GCF_000179915.2_ASM17991v2/GCF_000179915.2_ASM17991v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Spirochaeta_thermophila_DSM_6192/GCF_000147075.1_ASM14707v1/GCF_000147075.1_ASM14707v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Rhodoferax_ferrireducens_T118/GCF_000013605.1_ASM1360v1/GCF_000013605.1_ASM1360v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Brachybacterium_saurashtrense/GCF_003355475.1_ASM335547v1/GCF_003355475.1_ASM335547v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Corynebacterium_halotolerans_YIM_70093_DSM_44683/GCF_000341345.1_ASM34134v1/GCF_000341345.1_ASM34134v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Mycoplasma_hyorhinis_SK76/GCF_000313635.1_ASM31363v1/GCF_000313635.1_ASM31363v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Enterobacter_sp__RHBSTW_00994/GCF_013782625.1_ASM1378262v1/GCF_013782625.1_ASM1378262v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Amycolatopsis_japonica/GCF_000732925.1_ASM73292v1/GCF_000732925.1_ASM73292v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Escherichia_coli_O114_H49/GCF_002741255.1_ASM274125v1/GCF_002741255.1_ASM274125v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Pantoea_spMT58/GCF_014495885.1_ASM1449588v1/GCF_014495885.1_ASM1449588v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Magnetospira_spQH_2/GCF_000968135.1_ASM96813v1/GCF_000968135.1_ASM96813v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Devosia_spMC521/GCF_014127105.1_ASM1412710v1/GCF_014127105.1_ASM1412710v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Lactobacillus_ruminis_ATCC_27782/GCF_000224985.1_ASM22498v1/GCF_000224985.1_ASM22498v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Corynebacterium_pseudotuberculosis_I19/GCF_000152065.3_ASM15206v3/GCF_000152065.3_ASM15206v3_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Salmonella_enterica_subspenterica_serovar_Typhimurium_strSARA13/GCF_000486345.2_ASM48634v2/GCF_000486345.2_ASM48634v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Arcobacter_lekithochrous/GCF_013283835.1_ASM1328383v1/GCF_013283835.1_ASM1328383v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Cupriavidus_necator_H16/GCF_000009285.1_ASM928v2/GCF_000009285.1_ASM928v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Cupriavidus_necator_H16/GCF_004798725.1_ASM479872v1/GCF_004798725.1_ASM479872v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Thermus_spCCB_US3_UF1/GCF_000236585.1_ASM23658v1/GCF_000236585.1_ASM23658v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Salmonella_enterica_subsp__enterica_serovar_Typhimurium_strL_3553/GCF_000828595.1_ASM82859v1/GCF_000828595.1_ASM82859v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Paenibacillus_sp__FSL_R7_0273/GCF_000758625.1_ASM75862v1/GCF_000758625.1_ASM75862v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000832525.1_ASM83252v1/GCF_000832525.1_ASM83252v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013394245.1_ASM1339424v1/GCF_013394245.1_ASM1339424v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001941905.1_ASM194190v1/GCF_001941905.1_ASM194190v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002214765.1_ASM221476v1/GCF_002214765.1_ASM221476v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267775.1_ASM1326777v1/GCF_013267775.1_ASM1326777v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267255.1_ASM1326725v1/GCF_013267255.1_ASM1326725v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000978375.1_ASM97837v1/GCF_000978375.1_ASM97837v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001277915.1_ASM127791v1/GCF_001277915.1_ASM127791v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001635915.1_ASM163591v1/GCF_001635915.1_ASM163591v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_004771155.1_ASM477115v1/GCF_004771155.1_ASM477115v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002214705.1_ASM221470v1/GCF_002214705.1_ASM221470v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000635895.2_ASM63589v2/GCF_000635895.2_ASM63589v2_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_000789315.1_ASM78931v1/GCF_000789315.1_ASM78931v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002813875.1_ASM281387v1/GCF_002813875.1_ASM281387v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001941885.1_ASM194188v1/GCF_001941885.1_ASM194188v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_006384875.1_ASM638487v1/GCF_006384875.1_ASM638487v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013177495.1_ASM1317749v1/GCF_013177495.1_ASM1317749v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_005707595.1_ASM570759v1/GCF_005707595.1_ASM570759v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_003013315.1_ASM301331v1/GCF_003013315.1_ASM301331v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_009739985.1_ASM973998v1/GCF_009739985.1_ASM973998v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002290105.1_ASM229010v1/GCF_002290105.1_ASM229010v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002215175.1_ASM221517v1/GCF_002215175.1_ASM221517v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267275.1_ASM1326727v1/GCF_013267275.1_ASM1326727v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_001880305.1_ASM188030v1/GCF_001880305.1_ASM188030v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002216125.1_ASM221612v1/GCF_002216125.1_ASM221612v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_003568565.1_ASM356856v1/GCF_003568565.1_ASM356856v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002000005.1_ASM200000v1/GCF_002000005.1_ASM200000v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267475.1_ASM1326747v1/GCF_013267475.1_ASM1326747v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_008041975.1_ASM804197v1/GCF_008041975.1_ASM804197v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_002220285.1_ASM222028v1/GCF_002220285.1_ASM222028v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_009739965.1_ASM973996v1/GCF_009739965.1_ASM973996v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_004006495.1_ASM400649v1/GCF_004006495.1_ASM400649v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013267455.1_ASM1326745v1/GCF_013267455.1_ASM1326745v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
No assembly data file - /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Bacillus_cereus/GCF_013112375.1_ASM1311237v1/GCF_013112375.1_ASM1311237v1_genomic.fna.gz at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 172.
Taxon ID 2582918 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2583232 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661838 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661839 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 1660064 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2587161 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Summary input data:
    Refseq categories:
        : 16358
        Reference Genome: 16
        Representative Genome: 2995
    Assembly levels:
        Scaffold: 226
        Contig: 47
        Chromosome: 48
        Complete Genome: 19048
    {Category} X {Assembly level}:
        Representative Genome_Scaffold: 222
        Reference Genome_Complete Genome: 16
        Representative Genome_Contig: 47
        Undefined_Scaffold: 4
        Representative Genome_Complete Genome: 2681
        Undefined_Chromosome: 3
        Undefined_Complete Genome: 16351
        Representative Genome_Chromosome: 45
Total genomes: 19048
$VAR1 = 'Unexpected rank';
$VAR2 = 'isolate';
$VAR3 = '1206109';
$VAR4 = [
    '/sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Candidatus_Portiera_aleyrodidarum_BT_B_HRs/GCF_000292685.1_ASM29268v1/GCF_000292685.1_ASM29268v1_assembly_report.txt',
    '/sybig/home/projects/AG_Neesse/programme/MetaMaps_2/download/refseq/bacteria/Candidatus_Portiera_aleyrodidarum_BT_B_HRs/GCF_000300075.1_ASM30007v1/GCF_000300075.1_ASM30007v1_assembly_report.txt'
];

(base) tvb@deepthought:/sybig/home/projects/AG_Neesse/programme/MetaMaps_2$ perl buildDB.pl --DB databases/myDB --FASTAs download/refseq --taxonomy download/taxonomy_uniqueIDs
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_ADDRESS = "de_DE.UTF-8",
    LC_NAME = "de_DE.UTF-8",
    LC_MONETARY = "de_DE.UTF-8",
    LC_PAPER = "de_DE.UTF-8",
    LC_IDENTIFICATION = "de_DE.UTF-8",
    LC_TELEPHONE = "de_DE.UTF-8",
    LC_MEASUREMENT = "de_DE.UTF-8",
    LC_TIME = "de_DE.UTF-8",
    LC_NUMERIC = "en_US.UTF-8",
    LANG = "de_DE.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("de_DE.UTF-8").

Number of found FASTA input files: 0

Reading taxonomy from download/taxonomy_uniqueIDs .. done.

$VAR1 = [ 'Annotation files', 0 ];
$VAR2 = [ 'Protein files', 0 ];
Died at /sybig/home/projects/AG_Neesse/programme/MetaMaps_2/perlLib/taxTree.pm line 315.

tim488 commented 3 years ago

I can't see the image... @Electrocyte

Electrocyte commented 3 years ago

Attached the image

AlexanderDilthey commented 3 years ago

Hey @Electrocyte, I also can't see images... but I see a lot of text output, and this is also informative!

annotateRefSeqSequencesWithUniqueTaxonIDs.pl complains about an unexpected rank for one of the leaf nodes - this bit of output here:

$VAR1 = 'Unexpected rank';
$VAR2 = 'isolate';
$VAR3 = '1206109';

This is what I was referring to when I said we may need to iterate a bit - I have now added 'isolate' as a legitimate value for leaf nodes (line 223 of annotateRefSeqSequencesWithUniqueTaxonIDs.pl).

Could you update from GitHub and try again?

If further errors of this kind occur, you could also try to fix annotateRefSeqSequencesWithUniqueTaxonIDs.pl accordingly, and then submit a pull request once it runs fine.
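
To see in advance which rank values might trigger further 'Unexpected rank' errors, one could list the ranks present in the downloaded taxonomy. This is only a minimal sketch: it assumes the taxonomy directory contains the standard NCBI nodes.dmp file (fields separated by tab, pipe, tab, with the rank in the third field), and the function name list_ranks is hypothetical, not part of MetaMaps.

```shell
#!/bin/sh
# list_ranks FILE
# Counts how often each rank occurs in an NCBI-style nodes.dmp and prints
# "count<TAB>rank", most frequent first.
# Assumption: fields are separated by <TAB>|<TAB> and the rank is field 3.
list_ranks() {
    nodes_file="$1"
    awk -F'\t[|]\t' '{ counts[$3]++ }
        END { for (r in counts) print counts[r] "\t" r }' "$nodes_file" \
        | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Example call (path taken from the commands used in this thread):
# list_ranks download/taxonomy/nodes.dmp
```

Any rank in this list that also occurs on leaf nodes but is missing from the check in annotateRefSeqSequencesWithUniqueTaxonIDs.pl would be a candidate for the next error of this kind.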

buildDB.pl will not work properly until annotateRefSeqSequencesWithUniqueTaxonIDs.pl has run through without errors.
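
Since buildDB.pl depends on the annotation step, a quick check of the FASTA headers before running it may save time. The sketch below counts headers that lack a taxon ID tag. It assumes uncompressed .fna files, and it assumes that annotated contig IDs contain the string kraken:taxid| (this tag is an assumption about the expected header format and should be checked against the MetaMaps documentation). The function name count_untagged_headers is hypothetical.

```shell
#!/bin/sh
# count_untagged_headers DIR [TAG]
# Prints "<untagged> <total>": the number of FASTA header lines (starting
# with ">") in *.fna files below DIR that do not contain TAG, followed by
# the total number of header lines.
# Assumption: TAG defaults to "kraken:taxid|", which may differ from what
# your MetaMaps version expects.
count_untagged_headers() {
    dir="$1"
    tag="${2:-kraken:taxid|}"
    find "$dir" -type f -name '*.fna' -exec cat {} + \
        | awk -v tag="$tag" '
            /^>/ { total++; if (index($0, tag) == 0) untagged++ }
            END  { print untagged + 0, total + 0 }'
}

# Example call (path taken from the commands used in this thread):
# count_untagged_headers download/refseq
```

A total of 0 would suggest that no uncompressed .fna files were found at all, which would be consistent with "Number of found FASTA input files: 0"; a non-zero untagged count would suggest the annotation step did not complete.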

tim488 commented 3 years ago

Alright, I will get back to you!

tim488 commented 3 years ago

@AlexanderDilthey Hey, me again, sorry I got delayed somewhat. But now I had some time and hit some new problems. I changed line 223 in annotateRefSeqSequencesWithUniqueTaxonIDs.pl to

unless(($thisNode_rank eq 'biotype') or ($thisNode_rank eq 'serogroup') or ($thisNode_rank eq 'isolate') or ($thisNode_rank eq 'serotype') or ($thisNode_rank eq 'species') or ($thisNode_rank eq 'no rank') or ($thisNode_rank eq 'subspecies') or ($thisNode_rank eq 'varietas') or ($thisNode_rank eq 'strain'))

It now gives the following output:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_ADDRESS = "de_DE.UTF-8",
    LC_NAME = "de_DE.UTF-8",
    LC_MONETARY = "de_DE.UTF-8",
    LC_PAPER = "de_DE.UTF-8",
    LC_IDENTIFICATION = "de_DE.UTF-8",
    LC_TELEPHONE = "de_DE.UTF-8",
    LC_MEASUREMENT = "de_DE.UTF-8",
    LC_TIME = "de_DE.UTF-8",
    LC_NUMERIC = "en_US.UTF-8",
    LANG = "de_DE.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("de_DE.UTF-8").
Reading taxonomy from download/taxonomy ..
    done.

Taxon ID 2582918 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2583232 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661838 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2661839 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 1660064 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Taxon ID 2587161 not defined in tree in download/taxonomy - try recovering from merged nodes. at annotateRefSeqSequencesWithUniqueTaxonIDs.pl line 131.
Scanning /sybig/home/projects/AG_Neesse/programme/MetaMaps/download/refseq for *_assembly_report.txt
Summary input data:
    Refseq categories:
        Reference Genome: 16
        : 16403
        Representative Genome: 3005
    Assembly levels:
        Contig: 47
        Complete Genome: 19103
        Scaffold: 226
        Chromosome: 48
    {Category} X {Assembly level}:
        Representative Genome_Scaffold: 222
        Representative Genome_Contig: 47
        Representative Genome_Chromosome: 45
        Undefined_Scaffold: 4
        Reference Genome_Complete Genome: 16
        Undefined_Complete Genome: 16396
        Undefined_Chromosome: 3
        Representative Genome_Complete Genome: 2691
Total genomes: 19424 
Introduced 11866 new taxonomic IDs
Annotated 94511 contigs

Output new taxonomy into download/taxonomy_uniq

To me this looks promising. However, if I run the buildDB.pl script, I get A LOT of lines complaining that something cannot be parsed. I commented out the Complete Genome option in the annotateRefSeqSequencesWithUniqueTaxonIDs.pl script (as suggested in the tutorial); could this be a problem?

Can't parse gene line -- ID=gene-PEX2_032630;Dbxref=GeneID:27675957;Name=PEX2_032630;end_range=8228,.;gbkey=Gene;gene_biotype=protein_coding;locus_tag=PEX2_032630;partial=true;start_range=.,5952 NW_015971233.1 RefSeq gene 5952 8228 . + . ID=gene-PEX2_032630;Dbxref=GeneID:27675957;Name=PEX2_032630;end_range=8228,.;gbkey=Gene;gene_biotype=protein_coding;locus_tag=PEX2_032630;partial=true;start_range=.,5952 at buildDB.pl line 303, <A> line 19.

It then says:


Reading taxonomy from download/taxonomy_uniq
        done.

And then it repeats "Running count keys in gene-2-protein: 0" A LOT of times.

So there is obviously something off here. Do you have any idea?