BenoitMorel / AleRax

Gene tree - species tree reconciliation from gene tree distributions
18 stars 3 forks source link

[BUG] Three major bugs #8

Closed StefanFlaumberg closed 2 months ago

StefanFlaumberg commented 3 months ago

There are 3 major bugs in the current version:

Bug 1. DL model fails during transfer-guided search with:

src/ale/UndatedDTLMultiModel.hpp:457: void UndatedDTLMultiModel<REAL>::recomputeSpeciesProbabilities() [with REAL = ScaledValue]: Assertion `maxSpeciesId == transferRates.size()' failed.

Source: The bug stems from that in the AleEvaluator::getTransferInformation function the newly spawned UndatedDTLMultiModel evaluations copy their respective RecModelInfo objects from the initial model, which is DL, hence the evaluations set their _dtlRates vector to the length of wrong number of model free parameters. The bug must have appeared since the commit dd01380 when the _dtlRates vector became dependent on model free parameter number.

Bug 2. Since the commit 788f488 species tree inference accuracy drops drastically and the --prune-species-tree option ceases to work. Source: The bug stems from that since this commit both DL and DTL models refer to the MultiModel::onSpeciesTreeChange function for node list update every time the species tree topology changes, however the MultiModel::onSpeciesTreeChange function doesn't refer to the the BaseReconciliationModel::onSpeciesTreeChange function which usually does the update. Thus, no node update is happening.

Bug 3. Since the commit 788f488 species tree search under DTL model with LCA origination occasionally fails due to a segmentation fault. Source: The bug stems from that since this commit the UndatedDTLMultiModel::onSpeciesTreeChange function includes a call of the recomputeSpeciesProbabilities function. However, upon every species tree change the invoked SpeciesTree::onSpeciesTreeChange function will at first call the species tree listeners' UndatedDTLMultiModel::onSpeciesTreeChange function (which now also updates PS,PD,PL,PT etc.) and only then calls for the _lcaCache resetting. Thus, LCA nodes get drawn from the old _lcaCache. I don't see any reason to do recomputeSpeciesProbabilities on every species tree change, as we do it anyway before any likelihood computation by means of the beforeComputeLogLikelihood function.

I'm making a pull request that fixes all of this stuff.

Best, Stefan