steineggerlab / conterminator

Detection of incorrectly labeled sequences across kingdoms
GNU General Public License v3.0

Error: integer expression expected #9

Closed felipevzps closed 4 years ago

felipevzps commented 4 years ago

I'm having trouble predicting contamination in my FASTA file, so I tried running conterminator with a short FASTA file and mapping file.

command: conterminator dna assemblies/head_Hoang.fasta head_mapping_Hoang.mapping head_conterminator_Hoang_2017 head_Hoang_tmp --threads

After this command, I received this error message.

Error log:
2020-06-21 19:32:59 URL:https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz [52453100/52453100] -> "-" [1]
head_Hoang_tmp/6791693960183541482/createtaxdb/createindex.sh: line 58: [: : integer expression expected

Output log:
Tmp head_Hoang_tmp folder does not exist or is not a directory.
createdb assemblies/head_Hoang.fasta head_Hoang_tmp/6791693960183541482/sequencedb

Converting sequences
[
Time for merging to sequencedb_h: 0h 0m 12s 573ms
Time for merging to sequencedb: 0h 0m 6s 172ms
Database type: Nucleotide
Time for merging to sequencedb.lookup: 0h 0m 0s 143ms
Time for processing: 0h 0m 37s 540ms
Tmp head_Hoang_tmp/6791693960183541482/createtaxdb folder does not exist or is not a directory.
Download taxdump.tar.gz
Database created
Remove temporary files
splitsequence head_Hoang_tmp/6791693960183541482/sequencedb head_Hoang_tmp/6791693960183541482/db_rev_split --max-seq-len 1000 --sequence-overlap 0 --sequence-split-mode 1 --create-lookup 0 --threads 10 --compressed 1 -v 3

Time for processing: 0h 0m 0s 177ms
kmermatcher head_Hoang_tmp/6791693960183541482/db_rev_split head_Hoang_tmp/6791693960183541482/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 10 --compressed 0 -v 3

kmermatcher head_Hoang_tmp/6791693960183541482/db_rev_split head_Hoang_tmp/6791693960183541482/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 10 --compressed 0 -v 3

Database size: 5 type: Nucleotide

Generate k-mers list for 1 split
[=================================================================] 5 0s 0ms

Adjusted k-mer length 24
Sort kmer 0h 0m 0s 0ms
Sort by rep. sequence 0h 0m 0s 0ms
Time for fill: 0h 0m 0s 0ms
Time for merging to pref: 0h 0m 0s 73ms
Time for processing: 0h 0m 0s 146ms
head_Hoang_tmp/6791693960183541482/pref exists and will be overwritten.
crosstaxonfilterorf head_Hoang_tmp/6791693960183541482/sequencedb head_Hoang_tmp/6791693960183541482/db_rev_split_h head_Hoang_tmp/6791693960183541482/pref head_Hoang_tmp/6791693960183541482/pref_cross --blacklist 10239,12908,28384,81077,11632,340016,61964,48479,48510 --kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090) --threads 10 -v 3

Loading NCBI taxonomy
Loading nodes file ... Done, got 2256093 nodes
Loading merged file ... Done, added 58146 merged nodes.
Loading names file ... Done
Making matrix ... Done
Init RMQ ...Done
[=================================================================] 5 0s 174ms
Time for merging to pref_cross: 0h 0m 0s 916ms
Time for processing: 0h 0m 15s 913ms

Short FASTA file: head_Hoang.fasta
>k25_TRINITY_DN18_c0_g4_i1 len=298 path=[0:0-297]
CCGCTGTCGGTACTACATGCGTAGCCGTTTCTATTAAGTTAACCCTCAGCGACCCGAAGGGCAGGGTGCTTTGGGCGAGTTGTGTAAGTTTTAGCCGCTTCGTCCACAACGTACTGCACGGCGATTCTGTTCTATATGACAATCACTTTGGGTTTACAGACATCCCCTGGTGATTGCTGGACCCGAAGGTACCAGAAGCTCAGGGCTAAGGCTGCTTACGCAGCGAGAGCGAATTCCTGGCCGTAGTTTTCGTCATTGGCAACTATAAGAAGTTGCAACAGTGGATTTACGAGTTCTG
>k25_TRINITY_DN18_c0_g1_i1 len=256 path=[0:0-255]
TGTGTCAGATGTGGGGGTGTTTTTTGGGAGGATAAAAGGCTCTTTTGGTCAAGGAGCTAGGTATCAGATGTTTTCGTGGTATTGTTAATTTTCTGTGCTTTTGTTTTCGGTTTGTGCAAGGTATACCATGTCCGACTGTGTGCTGTGTACTCGCCACCGTTCTATAATGTGCTGTATACTCTCGCCACCTGTCTACTATCTCGACCTGTCTGGCTGTCTCTTGGCATAGCAAAATGGTCGCCTAATTTGTTAGCAC
>k25_TRINITY_DN18_c0_g2_i1 len=805 path=[0:0-804]
ACTTTCAGAACACACAGGGACTTATTTTTGTTGTAGACAGCAATGACAGGGAACGTGTTGTTGAGGCTAGAGATGAGCTCCACAGGATGCTGAATGAGGATGAGCTGCGTGACGCCGTGCTGCTTGTATTTGCAAACAAACAAGATCTTCCTAATGCTATGAATGCTGCTGAAATTACTGACAAGCTTGGTCTGCATTCCCTGCGCCAGCGGCACTGGTACATCCAGAGCACTTGTGCTACATCTGGTGAAGGGTTGTACGAGGGGCTTGACTGGCTTTCCAACAACATCGCCAGCAAGTCTTGAAGCTTTTCGGCTGGGCTCCTAGCAAGAGGAAGACGTCTGAACAACTTATGGCTGCTTTCTTTACATATTATTACTGATGAGACACAATCTGCAGTAGATAGTGTGTGTGCTTATGGGGAAAAAAACTTCTGAACTATCTTTTGAAGGATTGATACTAGATTGGATTGTATTTTTGTTTTTGGGTCGGTATCATCGCCTGTTTGGTTCCACAAAACTGACTGTCGGTCATATTTTACCTATGTTCGGAGATGGTATGGTAAATTAAATTGGCATGCCTTATCCTAGGTCTTCACCGATCCAGTGCTGTGGCAGATGTGACATGTGAGTTGCTAGATTTTGTTCTTAGATGGGTAAATGCTTATTACCATGCGTGCATGATTAAGACCTGTTTGATACACATGGCTAGTTACTAGTAAAGACAAAAAAAAATAGTCTGAATTAACTACGGTTGGCTAGCTAGTTGGCTGACTACTAGTTGAATTTAGGGCTAAACACTGCTC
>k25_TRINITY_DN18_c0_g3_i1 len=524 path=[0:0-523]
GCCTCAAACAGGTGGCGAGAGTATACAGCACAATGTAGAACAGTGGCGAGTACACACAGCACGCAGTCGGACATGGCATACCTTGCACAAACCAAAACAGAAGCACAGAAAATTAACAATACCACGAAAACATCTGATACCTAGCTCCTTGACCAAAAGAGCCTTTTATCCTCTCAGAAACACCCCCACATCTGACACGGTCACGACTATATAGTACATAGCTGTAATGGTAGAGGCAGTTTGCACCCAAGTATATCCAGAGTGATTCCATTGGTTGAAGCCCTGCCTAGATGCTTAAGCCTTGTTGGCAATGTTGTTTGAAAGCCAGTCAAGCCCCTCGTACAAACCCTCACCAGAGGTAGCACACGTGCTCTGGATGTACCAGTGCCGCTGGCGCAGCGAGTGCAGGCCAAGCTTATCAGTGATTTCAGCAGCGTTCATGGCATTAGGCAGATCTTGTTTGTTTGCAAACAAACAAGATCTGCCTAATGCCATGAACGCTGCTGAAATCACCGATAAGCCTG
>k25_TRINITY_DN18_c3_g1_i1 len=360 path=[0:0-359]
GTAAGACACTATCAGAGATAGCTGAACAGAGATAGCGTTCAGGTTCAGCTGTGGTAAGCTAGCTAGCTCCAAGCTAGACTCAAAATGGTACGTAGAACACATATATATGCATGGCTGTGCTTGTGTATGTACAAGATAAGTTCTTAATTAGTTTTAGTTCTTTCGGTAGGCAATTTTCAGTATAATAGTTGCTTATTCGTAATTAATATATGCATGGCTTAAAGTAGTTTTGTGCTATAATATGCATGTTTCGAAATATGGATGCAGAGCCGAGTGAAGATTGGACACTGGGGTGGACGAGGAGGGCAACAACGTGACGTGCAGTATGATCCTATCCAACTGGTCCGTGTCATAGTCTAC

Short mapping file: head_mapping_Hoang.mapping
k25_TRINITY_DN18_c0_g4_i1 286
k25_TRINITY_DN18_c0_g1_i1 0
k25_TRINITY_DN18_c0_g2_i1 0
k25_TRINITY_DN18_c0_g3_i1 0
k25_TRINITY_DN18_c3_g1_i1 0

How can I get the conterminator analysis to complete?

martin-steinegger commented 4 years ago

@felipevzps your mapping file contains 0 as a taxonomic identifier for the following entries. This might cause issues.

k25_TRINITY_DN18_c0_g1_i1 0
k25_TRINITY_DN18_c0_g2_i1 0
k25_TRINITY_DN18_c0_g3_i1 0
k25_TRINITY_DN18_c3_g1_i1 0
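
One way to catch such entries before running conterminator is to scan the mapping file for anything whose second column is not a positive integer. The following is only a minimal sketch: `check_mapping` is a hypothetical helper, not part of conterminator, and it assumes whitespace-separated "identifier taxid" lines like the ones above. Whether taxon ID 0 is really what triggers the error on line 58 of createindex.sh has not been confirmed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of conterminator): print every mapping line
# whose second column is missing, non-numeric, or 0.
# Assumes whitespace-separated "identifier taxid" lines.
check_mapping() {
    awk '$2 !~ /^[0-9]+$/ || $2 + 0 == 0 {
             print "line " NR ": " $0   # report the offending entry
             bad = 1
         }
         END { exit bad + 0 }' "$1"     # exit status 1 if anything was reported
}

# Small example built from two of the entries in this issue.
mapping=$(mktemp)
printf '%s\n' \
    'k25_TRINITY_DN18_c0_g4_i1 286' \
    'k25_TRINITY_DN18_c0_g1_i1 0' > "$mapping"

if check_mapping "$mapping"; then
    echo "all taxon IDs are positive integers"
else
    echo "fix or drop the entries listed above before running conterminator"
fi
```

Run against the full head_mapping_Hoang.mapping, this would list the four entries quoted above, which then need a valid NCBI taxon ID or have to be removed from the input.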