Closed dlaehnemann closed 11 months ago
hm this inconsistency wrt number of threads looks weird, could you please post the full log file?
oh wait there is a syntax error on the command line, you must specify on or off for the --prob-msa switch:
--prob-msa on | off use probabilistic alignment (works with CATG and VCF)
I fixed this, but the error remains the same. Here is a more complete log (but I skip all the reading in of taxa and sites):
RAxML-NG v. 1.2.0 released on 09.05.2023 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth, Julia Haag, Anastasis Togkousidis.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml
System: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, 20 cores, 125 GB RAM
RAxML-NG was called at 22-Jun-2023 13:20:31 as follows:
raxml-ng --all --msa results/raxml_ng_input/some_sample.ml_gt_and_likelihoods.catg --model GTGTR+FO --prefix results/raxml_ng/some_sample --prob-msa on --threads 16 --tree pars{1} --log DEBUG
Analysis options:
run mode: ML tree search + bootstrapping (Felsenstein Bootstrap)
start tree(s): parsimony (1)
bootstrap replicates: parsimony (max: 1000) + bootstopping (autoMRE, cutoff: 0.030000)
random seed: 1687432831
tip-inner: OFF
pattern compression: OFF
per-rate scalers: OFF
site repeats: OFF
logLH epsilon: general: 10.000000, brlen-triplet: 1000.000000
fast spr radius: AUTO
spr subtree cutoff: 1.000000
fast CLV updates: ON
branch lengths: proportional (ML estimate, algorithm: NR-FAST)
SIMD kernels: AVX
parallelization: coarse-grained (auto), PTHREADS (16 threads), thread pinning: OFF
RBA partial loading: OFF
|noname| |GT10GTR+FO| ||
[00:00:00] Reading alignment from file: results/raxml_ng_input/control.ml_gt_and_likelihoods.catg
Failed to load as IPHYLIP: Unable to parse PHYLIP file: results/raxml_ng_input/control.ml_gt_and_likelihoods.catg
(LIBPLL-233): Sequence 2 (MMMAMMNMNAAAAAAANNNNNNNAANNNNNNN) data out of alignment
Failed to load as PHYLIP: Unable to parse PHYLIP file: results/raxml_ng_input/control.ml_gt_and_likelihoods.catg
(LIBPLL-232): Sequence 1 (CATo8) longer than expected
Failed to load as FASTA: Error parsing FASTA file: results/raxml_ng_input/control.ml_gt_and_likelihoods.catg
(LIBPLL-203): Illegal header line in query fasta file
Failed to load as FASTA (long labels): Error parsing FASTA file: results/raxml_ng_input/control.ml_gt_and_likelihoods.catg
(LIBPLL-203): Illegal header line in query fasta file
CATG: taxa: 32, sites: 49064
CATG: taxon 0: CATo8
[...]
CATG: taxon 31: CAB1
CATG: site 0 consesus seq: MMMAMMNMNAAAAAAANNNNNNNAANNNNNNN
CATG: number of states: 01-Jan-1970 01:00:10
CATG: site 1 consesus seq: MMMMMCNNNMMNMMCMNNNNNNNNMNNMNNMN
CATG: site 2 consesus seq: MMMAMMMAAMNNMMMANNNNNNNNNNNNNAAN
[...]
CATG: site 49063 consesus seq: NNNNNNNNNNNNNNNNNNNNNNNNNTNNKNNN
[00:00:03] Loaded alignment with 32 taxa and 49064 sites
[00:00:03] Extracting partitions...
[00:00:03] Checking the alignment...
Alignment comprises 1 partitions and 49064 sites
Partition 0: noname
Model: GT10GTR+FO
Alignment sites: 49064
Gaps: 49.88 %
Invariant sites: 0.00 %
Recommended threads (response/balanced/throughput): 25 / 10 / 9
Parallelization scheme autoconfig: 1 worker(s) x 16 thread(s)
[00:00:03] Generating a RANDOM starting tree, seed: 502175453
[00:00:03] Generating 1 parsimony starting tree(s) with 32 taxa
Estimated memory per parsimony thread: 7 MB
Parallel parsimony with 16 threads
[00:00:03] [worker #0] Generated a PARSIMONY starting tree, seed: 1904568126, score: 197219
Parallel reduction/worker buffer size: 1 KB / 0 KB
ERROR: vector::_M_default_append
thanks but I can't reproduce it, could you please send me your input file?
I'll try to produce a minimal triggering example, as I cannot share the full data publicly (without controlled access). This will hopefully also help narrow down the bug (or data issue) search further.
It seems like I can't get this to trigger with any reduced version of the data. All of these will run until past the previous error point:
49064 to 49063) without actually removing a (site) entry / line already gets the analysis past this line. So it doesn't seem to be a particular line that is triggering this, but rather something like the number and size of records.
In addition, I found out that changing the command from --tree pars{1} to --tree pars{2} causes raxml-ng to fail even earlier, with:
CATG: site 49063 consesus seq: NNNNNNNNNNNNNNNNNNNNNNNNNTNNKNNN
[00:00:49] Loaded alignment with 32 taxa and 49064 sites
[00:00:49] Extracting partitions...
[00:00:49] Checking the alignment...
Alignment comprises 1 partitions and 49064 sites
Partition 0: noname
Model: GT10GTR+FO
Alignment sites: 49064
Gaps: 49.88 %
Invariant sites: 0.00 %
Recommended threads (response/balanced/throughput): 25 / 10 / 9
Parallelization scheme autoconfig: 1 worker(s) x 1 thread(s)
[00:00:49] Generating a RANDOM starting tree, seed: 661317336
raxml-ng: /opt/conda/conda-bld/raxml-ng_1686044823122/work/src/main.cpp:1258: void load_start_trees(RaxmlInstance&): Assertion `i == instance.opts.num_searches' failed.
This at least has a minimal backtrace that points to some raxml-ng code, so maybe this helps you understand what's going on? Interestingly, it states "Generating a RANDOM starting tree" here (and in the original error), even though only --tree pars{1|2} was requested.
And finally, here's at least one example record so you roughly know what my data looks like (I altered some taxon entries, but the general format should be clear):
RRRRRRRRRRRRARAANNNNRANARNNRRNAN 7.516919868066907e-05,0.0,0.0006961930193938315,0.0,0.0,0.9992288922614705,0.0,0.0,0.0,0.0 0.07238359749317169,0.0,2.6162499125348404e-05,0.0,0.0,0.9275895592581946,0.0,0.0,0.0,0.0 5.657959991367534e-05,0.0,0.0005624190089292824,0.0,0.0,0.9993802037460426,0.0,0.0,0.0,0.0 1.1723000170604791e-05,0.0,0.000673658971209079,0.0,0.0,0.9993150746452208,0.0,0.0,0.0,0.0 0.0005221319734118879,0.0,0.00044177399831824005,0.0,0.0,0.9990369205688694,0.0,0.0,0.0,0.0 0.10272099822759628,0.0,5.522049832507037e-06,0.0,0.0,0.897273358918028,0.0,0.0,0.0,0.0 0.2248540073633194,0.0,1.1224799891351722e-05,0.0,0.0,0.7751347868470475,0.0,0.0,0.0,0.0 1.0774800102808513e-05,0.0,0.0007209079922176898,0.0,0.0,0.9992684416459561,0.0,0.0,0.0,0.0 6.421249736376922e-07,0.0,0.0007396049913950264,0.0,0.0,0.9992593830213183,0.0,0.0,0.0,0.0 0.013520999811589718,0.0,0.00010749499779194593,0.0,0.0,0.9863714732455264,0.0,0.0,0.0,0.0 1.5063299940720754e-07,0.0,0.0007437890162691474,0.0,0.0,0.9992558822072304,0.0,0.0,0.0,0.0 0.09034000337123871,0.0,5.719130058423616e-06,0.0,0.0,0.9096553092240356,0.0,0.0,0.0,0.0 0.5612149834632874,0.0,5.508579761226429e-07,0.0,0.0,0.4387846173485741,0.0,0.0,0.0,0.0 0.004178300034254789,0.0,0.0002092140057357028,0.0,0.0,0.9956126061897521,0.0,0.0,0.0,0.0 0.5382919907569885,0.0,5.978749868518207e-07,0.0,0.0,0.46170735149644315,0.0,0.0,0.0,0.0 0.5382919907569885,0.0,5.978749868518207e-07,0.0,0.0,0.46170735149644315,0.0,0.0,0.0,0.0 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 1.132950001192512e-05,0.0,0.0006479779840447009,0.0,0.0,0.9993400698736252,0.0,0.0,0.0,0.0 0.9833009839057922,0.0,9.828800273670169e-11,0.0,0.0,0.016699400605276082,0.0,0.0,0.0,0.0 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.78541499376297,0.0,1.6060899952208274e-06,0.0,0.0,0.2145843373145908,0.0,0.0,0.0,0.0 
3.318259871321061e-07,0.0,0.0007419249741360545,0.0,0.0,0.9992573255024535,0.0,0.0,0.0,0.0 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.0001326689962297678,0.0,0.0006849199999123812,0.0,0.0,0.9991819075194144,0.0,0.0,0.0,0.0 2.0807999590033432e-07,0.0,0.0007430710247717798,0.0,0.0,0.9992566032654031,0.0,0.0,0.0,0.0 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 0.9050719738006592,0.0,1.0040400155730822e-07,0.0,0.0,0.09492797502025496,0.0,0.0,0.0,0.0 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1
Other than that, I am out of debugging ideas for now. Could I compile raxml-ng with some option to get a better backtrace or even more debugging info?
For now, I'll probably side-step the issue by changing my input filtering. But I'm keeping the erroring input file around, in case you have further debugging ideas.
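Since the suspicion above is about the number and size of records, a quick structural check of each CATG site record can help rule out the input itself. The following is only a sketch, assuming the layout of the example record shown above: a consensus string with one character per taxon, followed by one whitespace-separated, comma-delimited likelihood vector per taxon (10 values each for the GT10 model). The function name `check_catg_record` and the non-negativity check are illustrative assumptions, not part of raxml-ng.

```python
def check_catg_record(line, n_taxa, n_states=10):
    """Return a list of problems found in one CATG site record (empty if none).

    Assumed layout (taken from the example record in this thread, not from
    the raxml-ng sources): consensus string, then one comma-separated
    likelihood vector per taxon, all separated by whitespace.
    """
    fields = line.split()
    if not fields:
        return ["empty record"]

    consensus, entries = fields[0], fields[1:]
    problems = []

    if len(consensus) != n_taxa:
        problems.append(
            f"consensus has {len(consensus)} characters, expected {n_taxa}"
        )
    if len(entries) != n_taxa:
        problems.append(
            f"found {len(entries)} likelihood vectors, expected {n_taxa}"
        )

    for i, entry in enumerate(entries):
        try:
            values = [float(v) for v in entry.split(",")]
        except ValueError:
            problems.append(f"taxon {i}: non-numeric likelihood value")
            continue
        if len(values) != n_states:
            problems.append(
                f"taxon {i}: {len(values)} likelihoods, expected {n_states}"
            )
        elif any(v < 0.0 for v in values):
            # Assumption: likelihoods should never be negative.
            problems.append(f"taxon {i}: negative likelihood")

    return problems


# Tiny two-taxon illustration (made-up values, same layout as the record above).
uniform = ",".join(["0.1"] * 10)
good_record = f"NN {uniform} {uniform}"
bad_record = f"NN {uniform}"  # one likelihood vector missing

print(check_catg_record(good_record, n_taxa=2))
print(check_catg_record(bad_record, n_taxa=2))
```

Running this over every site line of the real file with `n_taxa=32` would flag records whose entry count disagrees with the header, which is one cheap way to separate a data issue from a raxml-ng issue.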
Thanks for your debugging efforts & detailed report!
So it seems that the original error only occurs under very specific and rare circumstances, which is generally good news :) Still, the easiest way to debug would be if you could send me your (anonymized) input file; you can e.g. obscure the taxon names.
The second error with --tree pars{2} is a "known bug": raxml-ng just loaded the checkpoint file from the old run with --tree pars{1}, and then noticed that the number of starting trees does not match. So the problem can be fixed by simply adding --redo, although the error message should definitely be improved.
Sorry for the late follow-up, and thanks for pointing me to the checkpointing as a potential problem. Even with different filtering, the original error persisted.
But it appears that a remaining checkpointing file from some previous run was causing the original error, as well. Removing the checkpointing file allowed the command to run through.
Maybe picking up from a previous checkpoint should not be the default behaviour? Or raxml-ng should at least warn about detecting one? Or check for consistency of the parameters from the original run and the current one (or does it do that already)? But this is just me thinking out loud, here...
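Until raxml-ng itself warns about this, a workflow wrapper could look for a leftover checkpoint before launching a run. The sketch below assumes that the checkpoint is written next to the other outputs as `<prefix>.raxml.ckp`; that file name is an assumption here and should be verified against the files your raxml-ng version actually produces. The helper `build_raxml_command` is hypothetical; only `--prefix` and `--redo` are taken from this thread.

```python
from pathlib import Path


def build_raxml_command(base_args, prefix, redo=False):
    """Assemble a raxml-ng command line and report a leftover checkpoint.

    Returns (command_as_list, checkpoint_found). The checkpoint name
    '<prefix>.raxml.ckp' is an assumption, not confirmed in this thread.
    """
    checkpoint = Path(f"{prefix}.raxml.ckp")
    checkpoint_found = checkpoint.exists()

    command = ["raxml-ng", *base_args, "--prefix", str(prefix)]
    if checkpoint_found and redo:
        # --redo makes raxml-ng ignore results of a previous run
        # instead of resuming from them.
        command.append("--redo")
    return command, checkpoint_found


if __name__ == "__main__":
    args = ["--all", "--prob-msa", "on", "--threads", "16", "--tree", "pars{1}"]
    cmd, found = build_raxml_command(args, "results/raxml_ng/some_sample", redo=True)
    if found:
        print("warning: checkpoint from a previous run detected, adding --redo")
    print(" ".join(cmd))
```

The command is only assembled and printed, not executed, so the decision between resuming and restarting stays explicit in the calling workflow.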
The error comes up when running the following command:
It seems to read in the input file correctly (I also ran raxml-ng --parse on it beforehand, and that went fine), it generates a starting tree and then throws the error ERROR: vector::_M_default_append. Here are the last lines of DEBUG output (starting from the end of input parsing):
The only thing that stands out to me is the difference in --threads 16 specified in the command and the statement Parallel parsimony with 20 threads. Other than that, I have no clue how to debug this -- it doesn't seem to be my input, as the parsing goes just fine. Any ideas?