BEAST2-Dev / BEASTLabs

A package for BEAST 2 implementing generally useful stuff

SimpleRandomTree builds mixed clades under monophyly constraints #8

Open Anaphory opened 5 years ago

Anaphory commented 5 years ago

In my beast.xml (attached), I have MRCA priors set up in a consistent manner, but due to the implementation of SimpleRandomTree, the initialization fails.

At its core, I have the following stack of MRCA priors.

    <distribution id="alor1247-besar_originateMRCA" monophyletic="true" spec="beast.math.distributions.MRCAPrior" tree="@Tree.t:beastlingTree" useOriginate="true">
      <taxonset id="tx_alor1247-besar" spec="TaxonSet">
        <taxon idref="alor1247-besar" />
      </taxonset>
      <Normal id="DistributionForalor1247-besar_originateMRCA" mean="450.0" name="distr" offset="0.0" sigma="25.5102040816" />
    </distribution>
    <distribution id="alorese_originateMRCA" monophyletic="true" spec="beast.math.distributions.MRCAPrior" tree="@Tree.t:beastlingTree" useOriginate="true">
      <taxonset id="alorese" spec="TaxonSet">
        <plate range="alor1247-baran,alor1247-besar,alor1247-munas,alor1247-pandai" var="language">
          <taxon idref="$(language)" />
        </plate>
      </taxonset>
      <Normal id="DistributionForalorese_originateMRCA" mean="675.0" name="distr" offset="0.0" sigma="12.7551020408" />
    </distribution>
    <distribution id="lamaholoticMRCA" monophyletic="true" spec="beast.math.distributions.MRCAPrior" tree="@Tree.t:beastlingTree">
      <taxonset id="lamaholotic" spec="TaxonSet">
        <plate range="alor1247-baran,alor1247-besar,alor1247-munas,alor1247-pandai,lama1277-adona,lama1277-baipi,lama1277-bama,lama1277-belan,lama1277-botun,lama1277-dulhi,lama1277-horow,lama1277-ileap,lama1277-imulo,lama1277-kalik,lama1277-kiwan,lama1277-lamah,lama1277-lamak,lama1277-lamal,lama1277-lamat,lama1277-lerek,lama1277-lewob,lama1277-lewoe,lama1277-lewog,lama1277-lewoi,lama1277-lewok,lama1277-lewom,lama1277-lewop,lama1277-lewot,lama1277-lewuk,lama1277-merde,lama1277-minga,lama1277-mulan,lama1277-paina,lama1277-pukau,lama1277-ritae,lama1277-tanju,lama1277-waiba,lama1277-waiwa,lama1277-watan,lama1277-wuake" var="language">
          <taxon idref="$(language)" />
        </plate>
      </taxonset>
      <Uniform id="DistributionForlamaholoticMRCA" lower="300.0" name="distr" offset="0.0" upper="9223372036854775807" />
    </distribution>

In the doTheWork() method of SimpleRandomTree, Alor-Besar is first grouped with one of its Alorese siblings to ensure that the MRCA prior on its ancestor can be fulfilled. Both are then removed from the taxon sets that still need to be considered. When the turn comes up for the equivalent step for the Alorese taxon set, a similar thing happens to another Alorese sibling together with one Lamaholot taxon. For some reason that still escapes my debugging, that grouping takes priority over the Alorese-internal grouping, so the SimpleRandomTree ends up containing something like

(((alor1247-besar,alor1247-baran),(alor1247-munas,(alor1247-pandai,lama1277-lamah))),(lama…

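
To make the violation concrete, here is a minimal sketch in plain Java of how one could flag "mixed" clades. It does not use the BEAST 2 or BEASTLabs API. The class `MixedCladeCheck` and its methods are hypothetical names introduced only for this illustration, and the clade list is typed in by hand from the visible part of the tree above. A clade is treated as mixed with respect to a monophyly constraint if it shares taxa with the constraint but is neither contained in it nor contains all of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of BEAST 2 or BEASTLabs.
public class MixedCladeCheck {

    /** True if the clade overlaps the constraint without nesting inside it or containing it. */
    static boolean isMixed(Set<String> clade, Set<String> constraint) {
        boolean overlaps = false;
        for (String taxon : clade) {
            if (constraint.contains(taxon)) {
                overlaps = true;
                break;
            }
        }
        return overlaps
                && !constraint.containsAll(clade)   // clade is not inside the constraint
                && !clade.containsAll(constraint);  // clade does not contain the whole constraint
    }

    /** Returns every clade in the list that is mixed with respect to the constraint. */
    static List<Set<String>> mixedClades(List<Set<String>> clades, Set<String> constraint) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> clade : clades) {
            if (isMixed(clade, constraint)) {
                result.add(clade);
            }
        }
        return result;
    }

    /** Convenience constructor for a taxon set. */
    static Set<String> set(String... taxa) {
        return new HashSet<>(Arrays.asList(taxa));
    }

    public static void main(String[] args) {
        // The Alorese monophyly constraint from the XML above.
        Set<String> alorese = set("alor1247-baran", "alor1247-besar",
                                  "alor1247-munas", "alor1247-pandai");

        // Clades read off the visible part of the tree; the rest is truncated in the report.
        List<Set<String>> clades = Arrays.asList(
                set("alor1247-besar", "alor1247-baran"),
                set("alor1247-pandai", "lama1277-lamah"),
                set("alor1247-munas", "alor1247-pandai", "lama1277-lamah"),
                set("alor1247-besar", "alor1247-baran", "alor1247-munas",
                    "alor1247-pandai", "lama1277-lamah"));

        for (Set<String> clade : mixedClades(clades, alorese)) {
            System.out.println("Violates Alorese monophyly: " + clade);
        }
    }
}
```

Under these assumptions, the two clades that pair alor1247-pandai with lama1277-lamah are reported, while (alor1247-besar,alor1247-baran) is not, because it nests entirely inside the Alorese set.
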
Anaphory commented 5 years ago

(The XML file is big, because I have not built a minimal working example yet.) xml.zip