amkozlov / raxml-ng

RAxML Next Generation: faster, easier-to-use and more flexible
GNU Affero General Public License v3.0

Request for informative errors + raxml-ng: /opt/conda/conda-bld/raxml-ng_1569616353378/work/src/TreeInfo.cpp:360: double TreeInfo::optimize_params(int, double): Assertion `cur_loglh - new_loglh < -new_loglh * 1e-14' failed. #95

Open giyany opened 4 years ago

giyany commented 4 years ago

Hi,

I've been using raxml-ng 0.9 with ParGenes on a cluster for multiple datasets. While it is fast and convenient, I often run into errors that are incomprehensible to me and that seem to be caused by model misspecification. More informative errors would be immensely helpful and would probably save users hours of troubleshooting (especially when using ParGenes and having to scroll through many logs to find the offending command).

For example, I ran into trouble with this specific FASTA file when running a GTR+GAMMA model with ascertainment bias correction:

x9999032419.min4.fasta.txt

raxml-ng --msa x9999032419.min4.fasta --threads 2 --outgroup P61 --model GTR+G+ASC_LEWIS --bs-trees 250 --all

The process crashed during the bootstrap (BS) analysis with the following error:

raxml-ng: /opt/conda/conda-bld/raxml-ng_1569616353378/work/src/TreeInfo.cpp:360: double TreeInfo::optimize_params(int, double): Assertion `cur_loglh - new_loglh < -new_loglh * 1e-14' failed.

Removing the ascertainment bias correction from the model "solved" this issue (and hopefully this will help other users googling their way here). There were other issues of a similar nature, and it would be great if the errors pointed us towards checking our models.
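For readers trying to parse the message: the condition quoted in the assertion can be restated as a small check. This is only a sketch of the inequality as it appears in the error text, with a hypothetical function name and made-up example values; it says nothing about why raxml-ng violated it in this run. Read literally, it requires that the log-likelihood after parameter optimization (`new_loglh`) is not lower than the one before (`cur_loglh`) by more than a tiny relative tolerance.

```python
def loglh_within_tolerance(cur_loglh: float, new_loglh: float,
                           rel_tol: float = 1e-14) -> bool:
    """Restate the asserted condition: cur_loglh - new_loglh < -new_loglh * rel_tol.

    Log-likelihoods are negative here, so the right-hand side is a small
    positive number and the check tolerates only a minuscule decrease.
    """
    return cur_loglh - new_loglh < -new_loglh * rel_tol


# Illustrative values only (not taken from the failing run).
print(loglh_within_tolerance(-7452.000136, -7451.900000))  # improved: True
print(loglh_within_tolerance(-7452.000136, -7452.100000))  # got worse: False
```

In other words, the assertion fires when an optimization step that is expected to improve (or at least preserve) the likelihood makes it noticeably worse.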

amkozlov commented 4 years ago

Hi @giyany,

Thanks for your feedback! In this particular case, however, the problem does not seem to be model misspecification, so removing +ASC_LEWIS was just a workaround, not a real solution. I also suspect this problem might have been fixed in the most recent version.

Could you please post the raxml-ng log file for this run?

giyany commented 4 years ago

Absolutely:

RAxML-NG v. 0.9.0 released on 20.05.2019 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

RAxML-NG was called at 16-Jun-2020 13:19:44 as follows:

raxml-ng --msa x9999032419.min4.fasta.snp_sites.aln --threads 5 --outgroup P61 --model GTR+G+ASC_LEWIS --bs-trees 250 --all --redo

Analysis options:
  run mode: ML tree search + bootstrapping (Felsenstein Bootstrap)
  start tree(s): random (10) + parsimony (10)
  bootstrap replicates: 250
  random seed: 1592306384
  tip-inner: OFF
  pattern compression: ON
  per-rate scalers: OFF
  site repeats: ON
  branch lengths: proportional (ML estimate, algorithm: NR-FAST)
  SIMD kernels: AVX2
  parallelization: PTHREADS (5 threads), thread pinning: OFF

WARNING: Running in REDO mode: existing checkpoints are ignored, and all result files will be overwritten!

[00:00:00] Reading alignment from file: x9999032419.min4.fasta.snp_sites.aln
[00:00:00] Loaded alignment with 72 taxa and 1385 sites

Alignment comprises 1 partitions and 1145 patterns

Partition 0: noname
Model: GTR+FO+G4m+ASC_LEWIS
Alignment sites / patterns: 1385 / 1145
Gaps: 2.61 %
Invariant sites: 0.00 %

NOTE: Binary MSA file created: x9999032419.min4.fasta.snp_sites.aln.raxml.rba

[00:00:00] Generating 10 random starting tree(s) with 72 taxa
[00:00:00] Generating 10 parsimony starting tree(s) with 72 taxa
[00:00:00] Data distribution: max. partitions/sites/weight per thread: 1 / 229 / 3664

Starting ML tree search with 20 distinct starting trees

[00:00:15] ML tree search #1, logLikelihood: -7459.535727
[00:00:33] ML tree search #2, logLikelihood: -7458.007170
[00:00:48] ML tree search #3, logLikelihood: -7463.850172
[00:01:05] ML tree search #4, logLikelihood: -7463.361264
[00:01:22] ML tree search #5, logLikelihood: -7460.381088
[00:01:38] ML tree search #6, logLikelihood: -7458.201110
[00:01:54] ML tree search #7, logLikelihood: -7457.530134
[00:02:10] ML tree search #8, logLikelihood: -7464.874068
[00:02:28] ML tree search #9, logLikelihood: -7458.575231
[00:02:43] ML tree search #10, logLikelihood: -7465.032920
[00:03:00] ML tree search #11, logLikelihood: -7458.366839
[00:03:16] ML tree search #12, logLikelihood: -7452.565653
[00:03:31] ML tree search #13, logLikelihood: -7458.616442
[00:03:46] ML tree search #14, logLikelihood: -7452.000136
[00:04:00] ML tree search #15, logLikelihood: -7456.732411
[00:04:16] ML tree search #16, logLikelihood: -7457.753511
[00:04:30] ML tree search #17, logLikelihood: -7464.461245
[00:04:46] ML tree search #18, logLikelihood: -7459.924070
[00:05:00] ML tree search #19, logLikelihood: -7457.343260
[00:05:16] ML tree search #20, logLikelihood: -7458.614270

[00:05:16] ML tree search completed, best tree logLH: -7452.000136

[00:05:16] Starting bootstrapping analysis with 250 replicates.

The error isn't in the raxml log, but appears 3 times in stdout.
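Since the assertion message only shows up in the captured program output and not in the raxml-ng log, one way to avoid scrolling through many outputs by hand is to scan them for lines of this shape. The sketch below is an assumption-laden helper, not part of raxml-ng or ParGenes: the function name is hypothetical, and the regular expression assumes the `file:line: function: Assertion ... failed` layout seen in the message quoted in this thread.

```python
import re

# Assumed layout, based on the message quoted in this thread:
#   prog: <file>:<line>: <function>: Assertion `<expr>' failed.
ASSERT_RE = re.compile(r"(\S+):(\d+): (.+?): Assertion `(.+?)' failed")


def find_assertion_failures(text: str):
    """Return (file, line, function, expression) for each assertion failure line."""
    hits = []
    for raw_line in text.splitlines():
        m = ASSERT_RE.search(raw_line)
        if m:
            hits.append((m.group(1), int(m.group(2)), m.group(3), m.group(4)))
    return hits


sample = (
    "[00:05:16] Starting bootstrapping analysis with 250 replicates.\n"
    "raxml-ng: /opt/conda/conda-bld/raxml-ng_1569616353378/work/src/TreeInfo.cpp:360: "
    "double TreeInfo::optimize_params(int, double): "
    "Assertion `cur_loglh - new_loglh < -new_loglh * 1e-14' failed.\n"
)
for src_file, line_no, func, expr in find_assertion_failures(sample):
    print(f"{src_file}:{line_no} in {func}: {expr}")
```

Applied to each captured output file of a batch, this would list which runs hit an assertion and where in the source it was raised.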

BenoitMorel commented 4 years ago

Hi @giyany,

Beyond this specific crash, you are right that we have a problem in ParGenes when raxml-ng crashes (for instance, because of an assertion failure). In the default mode, it's very difficult for ParGenes to detect such an error and to point the user to the specific failing run.

A safer way of running ParGenes is to call pargenes-hpc-debug.py instead of pargenes-hpc.py. The debug mode does not allow individual raxml runs to parallelize over the sites, which has an impact on the runtime only if you have large sequences and few families. In this mode, if one run fails, ParGenes should be able to continue running the other jobs, and will report the raxml-ng runs that failed.
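The general idea of "keep going and report what failed" can be illustrated with a few lines of Python. This is a generic sketch and not how ParGenes is implemented; the `run_all` helper and the demo commands are hypothetical stand-ins for individual raxml-ng invocations.

```python
import subprocess
import sys


def run_all(commands):
    """Run each command in turn and return (command, return code) for every failure."""
    failures = []
    for cmd in commands:
        # Merge stderr into stdout so assertion messages are captured as well.
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        if result.returncode != 0:
            failures.append((cmd, result.returncode))
    return failures


if __name__ == "__main__":
    # Stand-in jobs: one succeeds, one exits with a non-zero status.
    demo = [
        [sys.executable, "-c", "print('ok')"],
        [sys.executable, "-c", "import sys; sys.exit(3)"],
    ]
    for cmd, code in run_all(demo):
        print("FAILED with return code", code, ":", cmd)
```

On POSIX systems, a process terminated by a signal (as typically happens after a failed assertion calls abort) is reported by `subprocess` with a negative return code, so the same non-zero check catches it.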

Best, Benoit

giyany commented 4 years ago

Hi @BenoitMorel, thanks for chiming in and for the tip. I used pargenes-hpc-debug.py and it's running smoothly so far.

amkozlov commented 4 years ago

@giyany thanks for sharing the log file! I double-checked and I cannot reproduce the error with the most recent GitHub version, so I assume this particular problem has already been fixed.

Please let me know if you encounter further crashes with other datasets.

YuejiaoHuang commented 3 years ago

Hi,

After installing RAxML-NG 0.9 through conda, I ran the following command: raxml-ng --bootstrap --msa rename_mse_241.fasta --msa-format FASTA --data-type DNA --seed 111 --model TVM+F+R9 --bs-trees 1000 --threads 20 --prefix T1.

I got a similar error: raxml-ng: /opt/conda/conda-bld/raxml-ng_1569616353378/work/src/TreeInfo.cpp:352: double TreeInfo::optimize_params(int, double): Assertion `cur_loglh - new_loglh < -new_loglh * 1e-14' failed.

Would you mind giving me some suggestions? Should I install the newest version, or is there a mistake in my command?

amkozlov commented 3 years ago

Yes, please install the latest version (1.0.2), which is also available in bioconda:

https://anaconda.org/bioconda/raxml-ng

Version 0.9 is very old, and many bugs have been fixed since then.