tougai opened this issue 3 years ago. Status: Open.
I'm having issues similar to those described by @tougai. Are there any updates or a potential workaround? Thank you!
I am also facing this same issue. I've attached example input files that I derived by altering the examples that came with conterminator. In particular, I have appended the "Human-real1" sequence to the "Virus" sequence. However, this also does not make it past the rescorediagonal command. The process doesn't freeze outright; it appears to be stuck processing, since there is significant CPU activity at this point. Attached: example.fas.txt example.mapping.txt
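For anyone who wants to build a comparable test input, the construction described above (appending one record's sequence to another) can be sketched in Python as follows. This is only an illustration: the helper names `parse_fasta`, `append_sequence`, and `format_fasta` are hypothetical, the record names are taken from the comment above, and the attached files may have been produced differently.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of {record_id: sequence}.

    The record id is the first whitespace-delimited token after '>'.
    """
    parts_by_id = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            parts_by_id[current] = []
        elif current is not None:
            parts_by_id[current].append(line)
    return {name: "".join(parts) for name, parts in parts_by_id.items()}


def append_sequence(records, donor, recipient):
    """Return a copy of records with donor's sequence appended to recipient's."""
    chimeric = dict(records)
    chimeric[recipient] = records[recipient] + records[donor]
    return chimeric


def format_fasta(records, width=60):
    """Serialize records back to FASTA text, wrapping sequences at `width`."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


# Hypothetical usage: records named "Virus" and "Human-real1" are assumed
# to exist in the input text.
# records = parse_fasta(open("dna.fas").read())
# chimeric = append_sequence(records, donor="Human-real1", recipient="Virus")
# open("example.fas", "w").write(format_fasta(chimeric))
```

The mapping file does not need to change for this construction, because no record identifiers are added or removed.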
I am having a similar issue. The command I executed was: nohup conterminator dna Ino_0_SCF9.fasta dna.mapping results tmp --threads 30 &
and the content of nohup.out is:
Tmp tmp folder does not exist or is not a directory.
createdb Ino_0_SCF9.fasta tmp/17425683134074217680/sequencedb
Converting sequences
[
Time for merging to sequencedb_h: 0h 0m 0s 14ms
Time for merging to sequencedb: 0h 0m 0s 8ms
Database type: Nucleotide
Time for merging to sequencedb.lookup: 0h 0m 0s 0ms
Time for processing: 0h 0m 0s 99ms
Tmp tmp/17425683134074217680/createtaxdb folder does not exist or is not a directory.
Download taxdump.tar.gz
2022-03-04 16:48:14 URL:https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz [57450018/57450018] -> "-" [1]
Database created
Remove temporary files
tmp/17425683134074217680/createtaxdb/createindex.sh: line 58: [: : integer expression expected
splitsequence tmp/17425683134074217680/sequencedb tmp/17425683134074217680/db_rev_split --max-seq-len 1000 --sequence-overlap 0 --sequence-split-mode 1 --create-lookup 0 --threads 30 --compressed 1 -v 3
Sequence split mode (--sequence-split-mode 0) and compressed (--compressed 1) can not be combined.
Turn compressed to 0[=================================================================] 1 0s 1ms
Time for merging to db_rev_split_h: 0h 0m 0s 2ms
Time for merging to db_rev_split: 0h 0m 0s 1ms
Time for processing: 0h 0m 0s 13ms
kmermatcher tmp/17425683134074217680/db_rev_split tmp/17425683134074217680/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 30 --compressed 0 -v 3
kmermatcher tmp/17425683134074217680/db_rev_split tmp/17425683134074217680/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 30 --compressed 0 -v 3
Database size: 17160 type: Nucleotide
Generate k-mers list for 1 split
[=================================================================] 17.16K 0s 47ms
Adjusted k-mer length 24
Sort kmer 0h 0m 0s 54ms
Sort by rep. sequence 0h 0m 0s 31ms
Time for fill: 0h 0m 0s 21ms
Time for merging to pref: 0h 0m 0s 2ms
Time for processing: 0h 0m 0s 189ms
tmp/17425683134074217680/pref exists and will be overwritten.
crosstaxonfilterorf tmp/17425683134074217680/sequencedb tmp/17425683134074217680/db_rev_split_h tmp/17425683134074217680/pref tmp/17425683134074217680/pref_cross --blacklist 10239,12908,28384,81077,11632,340016,61964,48479,48510 --kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090) --threads 30 -v 3
Loading NCBI taxonomy
Loading nodes file ... Done, got 2404460 nodes
Loading merged file ... Done, added 66368 merged nodes.
Loading names file ... Done
Making matrix ... Done
Init RMQ ...Done
[=================================================================] 17.16K 0s 16ms
Time for merging to pref_cross: 0h 0m 0s 37ms
Time for processing: 0h 0m 4s 86ms
The job has been running on 30 threads for about 60 hours now. When I check the process ID:
trickman 960071 2984 0.0 4144048 52420 ? Rl Mar04 119592:58 conterminator rescorediagonal tmp/17425683134074217680/db_rev_split tmp/17425683134074217680/db_rev_split tmp/17425683134074217680/pref_cross tmp/17425683134074217680/aln --sub-mat nucl:nucleotide.out,aa:blosum62.out --rescore-mode 2 --wrapped-scoring 0 --filter-hits 0 -e 0.001 -c 0 -a 1 --cov-mode 0 --min-seq-id 0.9 --min-aln-len 100 --seq-id-mode 0 --add-self-matches 0 --sort-results 0 --db-load-mode 0 --threads 30 --compressed 0 -v 3
I can see that the step that is hanging is rescorediagonal. Is there a solution, or any advice on how to avoid this?
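Until the underlying hang is resolved, one way to avoid a run silently consuming CPU for days is to launch it under a wall-clock limit. The following is a minimal Python sketch, not a fix for the hang itself. It assumes a POSIX system, and `run_with_timeout` is a hypothetical helper name. The child is started in its own session so that any processes it spawns can be terminated along with it.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) with a wall-clock limit.

    Returns (True, returncode) if the command finished in time,
    or (False, None) if it was terminated after timeout_s seconds.
    """
    # start_new_session=True puts the child in a new session and process
    # group, so the group id equals the child's pid (POSIX only).
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        returncode = proc.wait(timeout=timeout_s)
        return True, returncode
    except subprocess.TimeoutExpired:
        # Send SIGTERM to the whole process group, then reap the child.
        # A process that ignores SIGTERM would need SIGKILL instead.
        os.killpg(proc.pid, signal.SIGTERM)
        proc.wait()
        return False, None


# Hypothetical usage with the command from this comment and a 6-hour limit:
# finished, rc = run_with_timeout(
#     ["conterminator", "dna", "Ino_0_SCF9.fasta", "dna.mapping",
#      "results", "tmp", "--threads", "30"],
#     timeout_s=6 * 3600,
# )
```

The time limit is a judgment call; in the log above, every step before rescorediagonal completed within seconds.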
same issue....
same
This should be fixed now. I updated conterminator to the newest version of MMseqs2, which should resolve the issue.
Thanks, Martin, for looking into this. Unfortunately, when I run both the example that I gave and the example that @tougai gave, I receive the error "Error: rescorediagonal step died". I've attached the stderr and stdout logs: err_log.txt out_log.txt
Hi. I seem to have the same issue. I'm running conterminator Version: 1.c74b5 on a few eukaryote assemblies (one per run), and all of them are stuck at the rescorediagonal step. I couldn't find a log file, but what I can see on screen is shown below (it's running in a screen session, which annoyingly only lets me see part of the output). This run in particular has been going for more than a week now. Is there anything I can do? If I stop it now, can I restart it from the same step (or from the following step, if getting stuck in rescorediagonal is a glitch)? Thanks.
[=================================================================] 100.00% 1.76K 0s 26ms 19 eta 0s
Time for merging to db_rev_split_h: 0h 0m 0s 93ms
Time for merging to db_rev_split: 0h 0m 0s 97ms
Time for processing: 0h 0m 0s 464ms
kmermatcher tmp/6530182093867110841/db_rev_split tmp/6530182093867110841/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 128 --compressed 0 -v 3
kmermatcher tmp/6530182093867110841/db_rev_split tmp/6530182093867110841/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 128 --compressed 0 -v 3
Database size: 543723 type: Nucleotide
Generate k-mers list for 1 split
[=================================================================] 100.00% 543.72K 0s 683ms
Adjusted k-mer length 24
Sort kmer 0h 0m 1s 61ms
Sort by rep. sequence 0h 0m 0s 232ms
Time for fill: 0h 0m 0s 198ms
Time for merging to pref: 0h 0m 0s 165ms
Time for processing: 0h 0m 3s 69ms
tmp/6530182093867110841/pref exists and will be overwritten.
crosstaxonfilterorf tmp/6530182093867110841/sequencedb tmp/6530182093867110841/db_rev_split_h tmp/6530182093867110841/pref tmp/6530182093867110841/pref_cross --blacklist 10239,12908,28384,81077,11632,340016,61964,48479,48510 --kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090) --threads 128 -v 3
Loading NCBI taxonomy
Loading nodes file ... Done, got 2550769 nodes
Loading merged file ... Done, added 75874 merged nodes.
Loading names file ... Done
Making matrix ... Done
Init RMQ ...Done
[=================================================================] 100.00% 543.72K 0s 404ms
Time for merging to pref_cross: 0h 0m 0s 31ms
Time for processing: 0h 0m 4s 846ms
rescorediagonal tmp/6530182093867110841/db_rev_split tmp/6530182093867110841/db_rev_split tmp/6530182093867110841/pref_cross tmp/6530182093867110841/aln --sub-mat nucl:nucleotide.out,aa:blosum62.out --rescore-mode 2 --wrapped-scoring 0 --filter-hits 0 -e 0.001 -c 0 -a 1 --cov-mode 0 --min-seq-id 0.9 --min-aln-len 100 --seq-id-mode 0 --add-self-matches 0 --sort-results 0 --db-load-mode 0 --threads 128 --compressed 0 -v 3
Hi, I am trying to test conterminator on a very simple file to start with, but it freezes at the rescorediagonal stage. When I use the example files dna.fas and dna.mapping, everything is fine!
here is my fasta file toto.fa:
my mapping file toto.fa.taxidmapping:
chr1 4577
my command line:
conterminator dna toto.fa toto.fa.taxidmapping out tmp
and the log: