CompEvol / beast2

Bayesian Evolutionary Analysis by Sampling Trees
www.beast2.org
GNU Lesser General Public License v2.1
240 stars 84 forks source link

Tip date estimation with BDSKY tree prior and date consistency in DensiTree plots #1108

Closed jjmccollum closed 1 year ago

jjmccollum commented 1 year ago

I am working with a rooted phylogeny (really, a manuscript tradition) whose shape can (ideally) be governed by several constraints:

  1. the origin/root can be dated to a specific range of years;
  2. the tips can be dated to specific ranges of years, and in some cases, they can be fixed to exact dates; and
  3. the phylogeny can be assumed to follow a birth-death model.

I have tentatively chosen to work with the birth-death skyline (BDSKY) tree prior, as it presumably should accommodate all three constraints. I can get the following minimal working example to work:

<beast namespace="beast.pkgmgmt:beast.base.core:beast.base.inference:beast.base.evolution.alignment:beast.base.evolution.tree.coalescent:beast.base.util:beast.base.math:beast.evolution.nuc:beast.base.evolution.operator:beast.base.inference.operator:beast.base.evolution.sitemodel:beast.base.evolution.substitutionmodel:beast.base.evolution.likelihood:beast.base.evolution.speciation:beast.pkgmgmt:beast.base.core:beast.base.inference.parameter" version="2.0">
    <data id="alignment" dataType="nucleotide">
        <sequence taxon="human">
            AGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGCTTAAACCCCCTTATTTCTACTAGGACTATGAGAATCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATACCCTTCCCGTACTAAGAAATTTAGGTTAAATACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTG-CAATACTTAATTTCTGTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTCTAGACCAATGGGACTTAAACCCACAAACACTTAGTTAACAGCTAAGCACCCTAATCAAC-TGGCTTCAATCTAAAGCCCCGGCAGG-TTTGAAGCTGCTTCTTCGAATTTGCAATTCAATATGAAAA-TCACCTCGGAGCTTGGTAAAAAGAGGCCTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCA-CTCAGCCATTTTACCACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGGCCTCCATGACTTTTTCAAAAGGTATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCT-AAATCCTATATATCTTA-CACTGTAAAGCTAACTTAGCATTAACCTTTTAAGTTAAAGATTAAGAGAACCAACACCTCTTTACAGTGA
        </sequence>
        <sequence taxon="chimp">
            AGAAATATGTCTGATAAAAGAATTACTTTGATAGAGTAAATAATAGGAGTTCAAATCCCCTTATTTCTACTAGGACTATAAGAATCGAACTCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTACACCCTTCCCGTACTAAGAAATTTAGGTTAAGCACAGACCAAGAGCCTTCAAAGCCCTCAGCAAGTTA-CAATACTTAATTTCTGTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTCTAGATTAATGGGACTTAAACCCACAAACATTTAGTTAACAGCTAAACACCCTAATCAAC-TGGCTTCAATCTAAAGCCCCGGCAGG-TTTGAAGCTGCTTCTTCGAATTTGCAATTCAATATGAAAA-TCACCTCAGAGCTTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCA-CTCAGCCATTTTACCACAAAAAAGGAAGGAATCGAACCCCCTAAAGCTGGTTTCAAGCCAACCCCATGACCTCCATGACTTTTTCAAAAGATATTAGAAAAACTATTTCATAACTTTGTCAAAGTTAAATTACAGGTT-AACCCCCGTATATCTTA-CACTGTAAAGCTAACCTAGCATTAACCTTTTAAGTTAAAGATTAAGAGGACCGACACCTCTTTACAGTGA
        </sequence>
        <sequence taxon="bonobo">
            AGAAATATGTCTGATAAAAGAATTACTTTGATAGAGTAAATAATAGGAGTTTAAATCCCCTTATTTCTACTAGGACTATGAGAGTCGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATACCCTTCCCGTACTAAGAAATTTAGGTTAAACACAGACCAAGAGCCTTCAAAGCTCTCAGTAAGTTA-CAATACTTAATTTCTGTAAGGACTGCAAAACCCCACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTCTAGATTAATGGGACTTAAACCCACAAACATTTAGTTAACAGCTAAACACCCTAATCAGC-TGGCTTCAATCTAAAGCCCCGGCAGG-TTTGAAGCTGCTTCTTTGAATTTGCAATTCAATATGAAAA-TCACCTCAGAGCTTGGTAAAAAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCA-CTCAGCCATTTTACCACAAAAAAGGAAGGAATCGAACCCCCTAAAGCTGGTTTCAAGCCAACCCCATGACCCCCATGACTTTTTCAAAAGATATTAGAAAAACTATTTCATAACTTTGTCAAAGTTAAATTACAGGTT-AAACCCCGTATATCTTA-CACTGTAAAGCTAACCTAGCATTAACCTTTTAAGTTAAAGATTAAGAGGACCAACACCTCTTTACAGTGA
        </sequence>
        <sequence taxon="gorilla">
            AGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGAGGTTTAAACCCCCTTATTTCTACTAGGACTATGAGAATTGAACCCATCCCTGAGAATCCAAAATTCTCCGTGCCACCTGTCACACCCCATCCTAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTCACATCCTTCCCGTACTAAGAAATTTAGGTTAAACATAGACCAAGAGCCTTCAAAGCCCTTAGTAAGTTA-CAACACTTAATTTCTGTAAGGACTGCAAAACCCTACTCTGCATCAACTGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTCTAGATCAATGGGACTCAAACCCACAAACATTTAGTTAACAGCTAAACACCCTAGTCAAC-TGGCTTCAATCTAAAGCCCCGGCAGG-TTTGAAGCTGCTTCTTCGAATTTGCAATTCAATATGAAAT-TCACCTCGGAGCTTGGTAAAAAGAGGCCCAGCCTCTGTCTTTAGATTTACAGTCCAATGCCTTA-CTCAGCCATTTTACCACAAAAAAGGAAGGAATCGAACCCCCCAAAGCTGGTTTCAAGCCAACCCCATGACCTTCATGACTTTTTCAAAAGATATTAGAAAAACTATTTCATAACTTTGTCAAGGTTAAATTACGGGTT-AAACCCCGTATATCTTA-CACTGTAAAGCTAACCTAGCGTTAACCTTTTAAGTTAAAGATTAAGAGTATCGGCACCTCTTTGCAGTGA
        </sequence>
        <sequence taxon="orangutan">
            AGAAATATGTCTGACAAAAGAGTTACTTTGATAGAGTAAAAAATAGAGGTCTAAATCCCCTTATTTCTACTAGGACTATGGGAATTGAACCCACCCCTGAGAATCCAAAATTCTCCGTGCCACCCATCACACCCCATCCTAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTACACCCTTCCCGTACTAAGAAATTTAGGTTA--CACAGACCAAGAGCCTTCAAAGCCCTCAGCAAGTCA-CAGCACTTAATTTCTGTAAGGACTGCAAAACCCCACTTTGCATCAACTGAGCGCAAATCAGCCACTTTAATTAAGCTAAGCCCTCCTAGACCGATGGGACTTAAACCCACAAACATTTAGTTAACAGCTAAACACCCTAGTCAAT-TGGCTTCAGTCCAAAGCCCCGGCAGGCCTTAAAGCTGCTCCTTCGAATTTGCAATTCAACATGACAA-TCACCTCAGGGCTTGGTAAAAAGAGGTCTGACCCCTGTTCTTAGATTTACAGCCTAATGCCTTAACTCGGCCATTTTACCGCAAAAAAGGAAGGAATCGAACCTCCTAAAGCTGGTTTCAAGCCAACCCCATAACCCCCATGACTTTTTCAAAAGGTACTAGAAAAACCATTTCGTAACTTTGTCAAAGTTAAATTACAGGTC-AGACCCTGTGTATCTTA-CATTGCAAAGCTAACCTAGCATTAACCTTTTAAGTTAAAGACTAAGAGAACCAGCCTCTCTTTGCAATGA
        </sequence>
        <sequence taxon="siamang">
            AGAAATACGTCTGACGAAAGAGTTACTTTGATAGAGTAAATAACAGGGGTTTAAATCCCCTTATTTCTACTAGAACCATAGGAGTCGAACCCATCCTTGAGAATCCAAAACTCTCCGTGCCACCCGTCGCACCCTGTTCTAAGTAAGGTCAGCTAAATAAGCTATCGGGCCCATACCCCGAAAATGTTGGTTATACCCTTCCCATACTAAGAAATTTAGGTTAAACACAGACCAAGAGCCTTCAAAGCCCTCAGTAAGTTAACAAAACTTAATTTCTGCAAGGGCTGCAAAACCCTACTTTGCATCAACCGAACGCAAATCAGCCACTTTAATTAAGCTAAGCCCTTCTAGATCGATGGGACTTAAACCCATAAAAATTTAGTTAACAGCTAAACACCCTAAACAACCTGGCTTCAATCTAAAGCCCCGGCAGA-GTTGAAGCTGCTTCTTTGAACTTGCAATTCAACGTGAAAAATCACTTCGGAGCTTGGCAAAAAGAGGTTTCACCTCTGTCCTTAGATTTACAGTCTAATGCTTTA-CTCAGCCACTTTACCACAAAAAAGGAAGGAATCGAACCCTCTAAAACCGGTTTCAAGCCAGCCCCATAACCTTTATGACTTTTTCAAAAGATATTAGAAAAACTATTTCATAACTTTGTCAAAGTTAAATCACAGGTCCAAACCCCGTATATCTTATCACTGTAGAGCTAGACCAGCATTAACCTTTTAAGTTAAAGACTAAGAGAACTACCGCCTCTTTACAGTGA
        </sequence>
    </data>

    <tree id="tree" spec="beast.base.evolution.tree.Tree">
        <trait spec="beast.base.evolution.tree.TraitSet" traitname="date" units="year" value="human = 82, chimp = 83, bonobo = 94, gorilla = 76, orangutan = 77, siamang = 84">
            <taxa spec="TaxonSet" id="taxa" alignment="@alignment"/>
        </trait>
        <taxonset spec="TaxonSet" idref="taxa"/>
    </tree>

    <BirthDeathSkylineModel spec="bdsky.evolution.speciation.BirthDeathSkylineModel" id="birthDeath" tree="@tree">
        <parameter name="origin" id="origin" value ="150" lower="100" upper="200"/>

        <!-- dimension of reproductiveNumber, becomeUninfectiousRate and samplingProportion must be either equal to intervalNumber, or 1 (i.e. rate is constant over time) -->
        <parameter name="reproductiveNumber" id="reproductiveNumber" value="2" lower="0." dimension ="10"/>
        <parameter name="becomeUninfectiousRate" id="becomeUninfectiousRate" value="1." lower="0." dimension ="10"/>
        <parameter name="samplingProportion" id="samplingProportion" value="0.01" lower="0." upper="1." dimension ="10"/> 
    </BirthDeathSkylineModel>

    <siteModel gammaCategoryCount="4" id="SiteModel" proportionInvariant="@proportionInvariant" spec="SiteModel">
        <parameter dimension="1"  id="mutationRate" name="mutationRate" value="1.0"/>
        <parameter dimension="1" id="gammaShape" name="shape" value="0.5" lower="0.0" upper="1000.0"/>
        <substModel id="gtr" rateAC="@rateAC" rateGT="@rateGT" rateAT="@rateAT" rateCG="@rateCG" rateCT="@rateCT" spec="GTR">
            <parameter dimension="1"  id="rateAG" lower="0.0" name="rateAG" value="1.0"/>
            <frequencies estimate="true" id="freqs" spec="Frequencies">
                <parameter name="frequencies" id="freqParameter" value="0.25" dimension="4" lower="0." upper="1."/>
            </frequencies>
        </substModel>
    </siteModel>
    <branchRateModel id="StrictClock" spec="beast.base.evolution.branchratemodel.StrictClockModel">
        <parameter dimension="1" id="clock.rate" name="clock.rate" value="0.1" lower="0." upper="100."/> 
    </branchRateModel>

    <distribution id="posterior" spec="beast.base.inference.CompoundDistribution">
        <distribution id="prior" spec="beast.base.inference.CompoundDistribution">
            <distribution id="BDlikelihood" spec="beast.base.inference.CompoundDistribution">
                <distribution idref="birthDeath"/>      
            </distribution>
            <distribution  id="origin_Prior" x="@origin" spec="beast.base.inference.distribution.Prior">
                <distr spec="beast.base.inference.distribution.Uniform" lower="100" upper="200"/>
            </distribution>

            <!-- this prior distribution corresponds to a sampling proportion of about 40% -->
            <distribution id="samplingProportion_prior" spec="beast.base.inference.distribution.Prior" x="@samplingProportion">
                <distr spec="beast.base.inference.distribution.Beta" alpha="2." beta="3." offset="0."/>
            </distribution>
            <distribution  id="reproductiveNumber_Prior" x="@reproductiveNumber" spec="beast.base.inference.distribution.Prior">
                <distr spec="beast.base.inference.distribution.LogNormalDistributionModel" M="0" S="1.25"/> 
            </distribution>
            <distribution  id="becomeUninfectiousRate_Prior" x="@becomeUninfectiousRate" spec="beast.base.inference.distribution.Prior">
                <distr spec="beast.base.inference.distribution.Uniform" lower="0." upper="100."/>   
            </distribution>

            <distribution id="date.human" spec="beast.base.evolution.tree.MRCAPrior" tipsonly="true" tree="@tree">
                <taxonset id="taxonSet.human" spec="TaxonSet">
                    <taxon id="human" spec="Taxon"/>
                </taxonset>
                <distr spec="distribution.Uniform" lower="80" upper="84"/>
            </distribution>

            <distribution id="date.chimp" spec="beast.base.evolution.tree.MRCAPrior" tipsonly="true" tree="@tree">
                <taxonset id="taxonSet.chimp" spec="TaxonSet">
                    <taxon id="chimp" spec="Taxon"/>
                </taxonset>
                <distr spec="distribution.Uniform" lower="70" upper="96"/>
            </distribution>

            <distribution id="date.bonobo" spec="beast.base.evolution.tree.MRCAPrior" tipsonly="true" tree="@tree">
                <taxonset id="taxonSet.bonobo" spec="TaxonSet">
                    <taxon id="bonobo" spec="Taxon"/>
                </taxonset>
                <distr spec="distribution.Uniform" lower="90" upper="98"/>
            </distribution>
        </distribution>
        <distribution id="jointTreeLikelihood" spec="beast.base.inference.CompoundDistribution">      
            <distribution data="@alignment" id="treeLikelihood" spec="TreeLikelihood" tree="@tree" siteModel="@SiteModel" branchRateModel="@StrictClock"/>           
        </distribution>     
    </distribution>

    <RPNcalculator spec="beast.base.inference.util.RPNcalculator" id="birth" expression="reproductiveNumber becomeUninfectiousRate *"> <!-- s/(d+s) -->
        <parameter idref="becomeUninfectiousRate"/>
        <parameter idref="reproductiveNumber"/>        
    </RPNcalculator>
    <RPNcalculator spec="beast.base.inference.util.RPNcalculator" id="sampling" expression="becomeUninfectiousRate samplingProportion *"> 
        <parameter idref="becomeUninfectiousRate"/>
        <parameter idref="samplingProportion"/>        
    </RPNcalculator>
    <RPNcalculator spec="beast.base.inference.util.RPNcalculator" id="death" expression="becomeUninfectiousRate 1 samplingProportion - *"> <!-- b*S0/(d+s) -->
        <parameter idref="becomeUninfectiousRate"/>
        <parameter idref="samplingProportion"/>        
    </RPNcalculator>

    <run chainLength="200000000" id="mcmc" spec="MCMC" storeEvery="1000" distribution="@posterior">

        <state id="state" storeEvery="100000">
            <stateNode idref="tree"/>
            <stateNode idref="origin"/>
            <stateNode idref="becomeUninfectiousRate"/>
            <stateNode idref="samplingProportion"/>
            <stateNode idref="reproductiveNumber"/>

            <stateNode idref="clock.rate"/>     
            <stateNode idref="freqParameter"/>      
            <stateNode idref="gammaShape"/>
            <parameter dimension="1"  id="rateAC" lower="0.0" upper="100.0" name="stateNode" value="1.0"/>
            <parameter dimension="1"  id="rateGT" lower="0.0" upper="100.0" name="stateNode" value="1.0"/>
            <parameter dimension="1"  id="rateAT" lower="0.0" upper="100.0" name="stateNode" value="1.0"/>
            <parameter dimension="1"  id="rateCG" lower="0.0" upper="100.0" name="stateNode" value="1.0"/>
            <parameter dimension="1"  id="rateCT" lower="0.0" upper="100.0" name="stateNode" value="1.0"/>
            <parameter dimension="1"  id="proportionInvariant" name="stateNode" value="0.1"/>   
        </state>

        <!-- parameter weights for RealParameters roughly equal to parameter dimension --> 
        <operator id="becomeUninfectiousRate_scaler" spec="ScaleOperator" scaleFactor=".75" weight="10" parameter="@becomeUninfectiousRate"/>
        <operator id="sampling_scaler" spec="ScaleOperator" scaleFactor=".75" weight="10" parameter="@samplingProportion"/>
        <operator id="reproductiveNumber_scaler" spec="ScaleOperator" scaleFactor=".75" weight="10" parameter="@reproductiveNumber"/>

        <operator id="updown" spec="UpDownOperator" scaleFactor=".75" weight="10">
            <up idref="reproductiveNumber"/>
            <down idref="becomeUninfectiousRate"/>
        </operator>

        <operator id="orig_scaler" spec="ScaleOperator" scaleFactor=".75" weight="1" parameter="@origin"/>

        <operator id="tree_updown" spec="UpDownOperator" scaleFactor=".75" weight="10">
            <up idref="tree"/>
            <down idref="clock.rate"/>
        </operator>
        <operator id="clock.rate_Scaler" spec="ScaleOperator" scaleFactor=".75" weight="1" parameter="@clock.rate" />

        <operator id="treeScaler_root" spec="ScaleOperator" scaleFactor=".75" weight="1" tree="@tree" degreesOfFreedom="1" scaleAllIndependently="false" rootOnly="true"/>
        <operator id="treeScaler" spec="ScaleOperator" scaleFactor=".75" weight="20" tree="@tree"/>
        <operator spec="Uniform" weight="40" tree="@tree"/>
        <operator spec="SubtreeSlide" weight="20" gaussian="true" size="1." tree="@tree"/>
        <operator id="narrow2" spec="Exchange" isNarrow="true" weight="10" tree="@tree"/>
        <operator id="wide2" spec="Exchange" isNarrow="false" weight="10" tree="@tree"/>
        <operator spec="WilsonBalding" weight="10" tree="@tree"/>

        <operator spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@rateAC"/>
        <operator spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@rateAT"/>
        <operator spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@rateCG"/>
        <operator spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@rateCT"/>
        <operator spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@rateGT"/>
        <operator spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@proportionInvariant"/>
        <operator id="gammaShapeScaler" spec="ScaleOperator" scaleFactor=".75" weight=".2" parameter="@gammaShape"/>
        <operator autoOptimize="true" delta="0.2" id="FrequenciesExchanger" integer="false" spec="DeltaExchangeOperator" weight="0.1" parameter="@freqParameter"/>

        <operator spec="TipDatesRandomWalker" id="dateOperator.human" windowSize="1" taxonset="@taxonSet.human" tree="@tree" weight="1.0"/>
        <operator spec="TipDatesRandomWalker" id="dateOperator.chimp" windowSize="1" taxonset="@taxonSet.chimp" tree="@tree" weight="1.0"/>
        <operator spec="TipDatesRandomWalker" id="dateOperator.bonobo" windowSize="1" taxonset="@taxonSet.bonobo" tree="@tree" weight="1.0"/>

        <logger fileName="BDSKY_sequential_$(seed).log" id="tiplog" logEvery="1000" mode="autodetect" model="@posterior">
            <distribution idref="posterior" name="log"/>
            <log idref="BDlikelihood"/>
            <log idref="jointTreeLikelihood"/>
            <log id="TreeHeight" spec="beast.base.evolution.tree.TreeHeightLogger" tree="@tree"/>
            <log idref="origin"/>

            <log idref="gammaShape"/>
            <log idref="freqParameter"/>
            <log idref="rateAC"/>
            <log idref="rateAG"/>
            <log idref="rateAT"/>
            <log idref="rateCG"/>
            <log idref="rateCT"/>
            <log idref="rateGT"/>
            <log idref="proportionInvariant"/>

            <log idref="reproductiveNumber"/>
            <log idref="becomeUninfectiousRate"/>
            <log idref="samplingProportion"/>

            <log idref="birth"/>
            <log idref="death"/>
            <log idref="sampling"/>

            <log idref="date.human"/>
            <log idref="date.chimp"/>
            <log idref="date.bonobo"/>
        </logger>
        <logger id="screenlog" logEvery="10000" mode="autodetect">
            <distribution idref="posterior" name="log"/>
            <log arg="@posterior" id="ESS.0" spec="util.ESS"/>
            <log idref="BDlikelihood"/>
            <log idref="jointTreeLikelihood"/>
        </logger>
        <logger fileName="BDSKY_sequential_$(seed).trees" id="treelog" logEvery="10000" mode="tree">
            <log spec="beast.base.evolution.TreeWithMetaDataLogger" tree="@tree">
                <metadata idref="posterior"/>
                <branchratemodel idref="StrictClock"/>
            </log>
        </logger>
    </run>
</beast>

In the above example (augmented from https://github.com/CompEvol/beast2/blob/master/examples/testTipDates.xml), I have constrained the origin parameter to have a height between 100 and 200 (with a uniform prior in this range), I have treated the human, chimp, and bonobo taxa as having ranges of possible dates (with uniform MRCAPriors in these ranges for the tips and proposals made according to TipDatesRandomWalker operators), and I have treated the remaining taxa as having fixed dates (without priors or operators). As far as I can tell, BEAST is correctly reading the tip priors in terms of dates and the origin priors in terms of height, but if this is incorrect, feel free to let me know, and I can update the example above.

The problem is that when I plot the output distribution of trees as a DensiTree, the tips with fixed dates as well as the ones with variable dates appear to have their dates estimated:

bdsky_mrcaprior_densitree

Are there known issues like this when the BDSKY prior and tip date distributions are used in conjunction? (A potentially related issue is described in https://github.com/CompEvol/beast2/issues/981, but it was never resolved.) Or is the scattering of dates for fixed-date tips just an illusion resulting from the latest tip date being converted to height 0 in different trees? And in this latter case, is there a way to ensure that the DensiTree plot is aligned on a grid according to date rather than height? Ideally, the gorilla, orangutan, and siamang tips would be dated in all sampled trees to 76, 77, and 84, respectively.

rbouckaert commented 1 year ago

Hi Joey,

What seems to be happening is that because the trees in nexus format do not contain information about the youngest (= most recent) tip or the root height, DensiTree assumes the youngest tip is the baseline and puts every other node relative to the youngest tip based on branch length information (which is contained in the nexus file).

Usually, older tips, like fossils or ancient languages, have their tip date sampled, and the most recent tip is at a fixed date, causing no trouble in visualising the trees, but when the youngest tip is sampled the whole tree appears to be moving around. In summary, this is a visualisation issue in DensiTree (and other post-processing tools that rely on the nexus file only, like TreeAnnotator), not a BEAST or BDSKY issue.

A way around this could be to add a taxon that is younger than the sampled tips, or fixate the youngest tip at a fixed date. Make sure date ranges for sampled tips do not allow younger dates than the fixated tip.

Hope this is suitable for your situation.

jjmccollum commented 1 year ago

@rbouckaert Thanks for the clarification! I modified the siamang tip date to be fixed at 100, making it the youngest, and that seems to have fixed the DensiTree. Here is an updated plot for a distribution of trees generated from the updated XML file:

bdsky_mrcaprior_densitree_fixed

Your suggestion of adding a youngest tip at a fixed date makes sense, but is there a way to do this without letting the extra tip affect the rest of the model? If not, then it may be best to follow your second suggestion of fixing the youngest tip at a constant date.

It looks like this explains the issue I was seeing, so I will go ahead and close the issue. Thanks again!

rbouckaert commented 1 year ago

@jjmccollum Adding a dummy tip may impact tree prior parameters, and thus potentially age estimates, but how much depends on how long the branch into the dummy tip becomes, how many other taxa there are and how much signal there is in the data. So, if fixating an existing tip is a viable option, that would certainly be preferable.