Closed davidcerny closed 5 months ago
I noticed this in an FBD analysis I recently ran, not a single move was ever accepted.
So after investigating, I think everything is behaving correctly, but the default removal probability in BDSTP is set to 1 so unless you set it explicitly r=0
it will not allow SAs
if I set r to 0 in your script I do get a non-zero amount of SAs
I'm guessing the default used to be 0 and then changed somehow ? funnily enough the tutorial (https://revbayes.github.io/tutorials/fbd/fbd_specimen.html) is flat out wrong because it tells you to set r=1
for an FBD analysis which is 1) not doing anything because it's the default 2) the exact opposite of what it should be
@davidcerny can you check that I have it right ? if so we need to decide whether to change the default back and either way fix the doc
I can confirm that using r=1
for BDSTP results in no SAs for my own analyses, whereas r=0
results in a non-zero number of SAs. It looks like using FBDP always has SAs (no r
option).
thank you for testing this !
thinking about this more I wonder if there is a way to detect that you have an incompatibility in your setup like that and give a warning to the user - the issue is that it's two different components but maybe the mvCollapseExpandFossilBranch
could be in charge of checking that the associated tree does allow SAs
Sorry for the delay @bjoelle – I can confirm that setting r=0
inside dnBDSTP
indeed fixes the problem for the dataset and RevBayes version I originally reported in this issue: a variable (and generally nonzero) number of ancestors is sampled throughout the analysis.
That's a really interesting observation about the differences between dnBDSTP
and dnFBDP
. In the latter, the removal probability is indeed hardcoded to 0 at the Rev level:
https://github.com/revbayes/revbayes/blob/21ab167ad3046b7d049dac152089732e0fda08fc/src/revlanguage/distributions/phylogenetics/tree/Dist_FBDP.cpp#L137
This makes perfect sense given the original paper by Gavryushkina et al. (2014), which explicitly states that "the fossilized birth-death process [is ]the process with $\rho$-sampling and zero removal probability". It also makes sense to use dnBDSTP
for epidemiology and dnFBDP
for paleontology – the "T" stands for "treatment" and the "F" stands for "fossilized", after all. But in that case, we need to stop calling them "aliases" (as in the documentation) or "synonyms" (as in NEWS.md
) – clearly they are not that if choosing one or the other makes a difference!
As for the idea of mvCollapseExpandFossilBranch
checking whether sampled ancestors are actually enabled, I like it a lot; I'm just not sure how to implement it. There is no way for a move to access the values of the arguments passed to the distribution, right? Would this require a new flag in TimeTree
?
so there are two levels of issue here
the first is that we have actually 3 different versions of the thing, dnFBDP
, dnBDSTP
and dnPhylodynamicBDP
that all appear to do exactly the same thing but with different removal probabilities, but they don't share code which is super inefficient (and I'm pretty sure a c/p from dnPhylodynamicBDP
is actually what led to the change in behaviour in dnBDSTP
which led to this whole mess)
so ideally I would change the architecture so dnBDSTP
is the master class and doesn't have a default, and then both others can derive from it but run with defaults r=0
and r=1
respectively
second issue is that move checking idea, which I think would need to call the distribution not the tree, but I haven't looked into it
plus as you noted fixing the documentation and tutorial
I am planning on working on this further next week hopefully
That all makes sense to me. Thank you so much for figuring this out Joëlle!
Describe the bug This is a problem I first mentioned on Slack in November 2023, but I wanted to create a GitHub issue for it as well to make sure it doesn't slip through the cracks. I also followed @bredelings's advice to run the example dataset below with multiple seeds to find the one that crashes first.
Recently, I tried to run a tip-dating analysis on a small morphological matrix under the FBD model and a white-noise relaxed clock, and noticed problematic behavior from the
mvCollapseExpandFossilBranch()
move. When theDEBUG_MCMC
preprocessor macro is enabled, the analysis crashes with the following error after ~380 iterations (~96,000 moves):Without the macro, the analysis will seemingly run just fine, but the move will never be accepted, and no ancestors will ever be sampled:
As far as I can tell, this problem is separate from issue #401, another recently described bug having to do with sampled ancestors under the FBD model.
To reproduce This zipped directory contains (1) a character matrix, (2) a taxon data file with tip age ranges, (3) a starting tree, and (4) a script to run the analysis. One absolute path has to be changed in the script.
Expected behavior The
mvCollapseExpandFossilBranch()
move should be accepted at least some of the time.Computer info Red Hat Enterprise Linux 8.4 (Ootpa), RevBayes 1.2.2,
development
branch, commit rapture-2364-ga6a40c (two PRs behind the tip of the branch).