lmaurits / BEASTling

A linguistics-focussed command line tool for generating BEAST XML files.
BSD 2-Clause "Simplified" License

Something wrong with monophyly #185

Closed (Anaphory closed this issue 6 years ago)

Anaphory commented 6 years ago

There is something seriously wrong with specifying a monophyly Newick tree. I have the following configuration (it uses the language_group calibration code from my recent pull request, but that is hardly the issue here).

[admin]
basename = tapandan
log_params = True
[MCMC]
chainlength = 10000000
[languages]
monophyly_newick = clades.nex
[language_groups]
alorese = alor1247-baran, alor1247-munas, …
…
[model vocabulary]
data = ../cldf/Wordlist-metadata.json
model = covarion
file_format = cldf
reconstruct_at = alorese, lamaholot
reconstruct = *
rate_variation = True
[calibration]
p-aust1307-abvd = 4500-5000
…

The monophyly_newick tree is the following; in particular, it contains the clade (buna1278-bobon,buna1278-suai,buna1278-malia).

((((((alor1247-baran,alor1247-besar,alor1247-pandai,alor1247-munas,lama1277-adona,lama1277-baipi,lama1277-bama,lama1277-belan,lama1277-botun,lama1277-dulhi,lama1277-horow,lama1277-ileap,lama1277-imulo,lama1277-kalik,lama1277-kiwan,lama1277-lamah,lama1277-lamak,lama1277-lamal,lama1277-lamat,lama1277-lewoe,lama1277-lewok,lama1277-lewog,lama1277-lewom,lama1277-lewop,lama1277-lewot,lama1277-lewob,lama1277-lewuk,lama1277-merde,lama1277-minga,lama1277-mulan,lama1277-paina,lama1277-pukau,lama1277-ritae,lama1277-tanju,lama1277-waiba,lama1277-waiwa,lama1277-watan,lama1277-wuake,lama1277-lewoi,lama1277-lerek),keda1252-leuba,keda1252-leuwa,keda1252,sika1262-hewa,sika1262-maume,sika1262-tanai),idat1237,kema1243,laka1255,mamb1306,tetu1246,tetu1245-suai,tetu1245-vique,tuku1254,amar1273-kotos,p-cent2245-abvd),pmpacd,p-mala1545-abvd),p-aust1307-abvd),(((atim1239,abui1241-fuime,abui1241-petle,abui1241-takal,abui1241-ulaga),adan1251-otvai,adan1251-lawah,baka1276,(blag1240-bama,blag1240-kulij,blag1240-nule,pura1258,blag1240-tuntu,blag1240-warsa),dein1238,hama1240,kabo1247,kaer1234,kafo1240,kama1365,kira1248,kelo1247-bring,kelo1247-hopte,kuii1253,kula1280-lanto,nede1245,rett1240,alor1249-sar-a,alor1249-sar-n,sawi1256,teiw1235,kola1287,wers1238-marit,lamm1241-westp,p-alor1249),((buna1278-bobon,buna1278-suai,buna1278-malia),fata1247,maka1316,p-east2519),p-timo1261))

However, the trees I get out of the BEAST run (see below) do not have this clade. The starting tree contains ((buna1278-bobon:77.8718,fata1247:77.8718):723.378,buna1278-suai:801.2497), and the third Bunaq dialect, buna1278-malia, has landed in the other family's half of the tree.

(((abui1241-fuime:2418.4087,(((((((((abui1241-petle:14.6844,abui1241-takal:14.6844):13.8815,(kelo1247-bring:11.2746,kula1280-lanto:11.2746):17.2914):3.9994,blag1240-kulij:32.5654):32.1954,abui1241-ulaga:64.7608):22.0506,lamm1241-westp:86.8114):4.8289,((baka1276:33.8527,kuii1253:33.8527):41.0324,pura1258:74.8851):16.7552):63.0233,(alor1249-sar-a:36.6696,kaer1234:36.6696):117.994):210.1118,(((((adan1251-lawah:78.1926,(((blag1240-nule:0.035,kira1248:0.035):55.7368,sawi1256:55.7718):2.8385,blag1240-tuntu:58.6103):19.5822):31.8098,((alor1249-sar-n:20.5341,p-alor1249:20.5341):64.5452,(blag1240-warsa:37.1612,rett1240:37.1612):47.9181):24.923):14.5475,hama1240:124.5498):35.417,kola1287:159.9669):93.2266,((adan1251-otvai:4.8259,atim1239:4.8259):247.5071,(kelo1247-hopte:5.0113,teiw1235:5.0113):247.3218):0.8604):111.582):260.4256,(((blag1240-bama:26.0008,(dein1238:23.1157,kama1365:23.1157):2.8851):14.4539,(kafo1240:24.9595,nede1245:24.9595):15.4952):182.8183,(kabo1247:24.7416,wers1238-marit:24.7416):198.5314):401.928):1793.2077):1970.704,(((((buna1278-bobon:77.8718,fata1247:77.8718):723.378,buna1278-suai:801.2497):198.3283,maka1316:999.578):99.2289,p-east2519:1098.8069):134.125,p-timo1261:1232.9319):3156.1808):269.0386,((((((((((alor1247-baran:141.9751,alor1247-munas:141.9751):5.8392,alor1247-pandai:147.8143):29.7767,alor1247-besar:177.591):44.9869,(keda1252-leuba:24.2151,lama1277-bama:24.2151):198.3628):28.4143,(buna1278-malia:38.9122,((keda1252-leuwa:24.9561,(lama1277-adona:6.9778,sika1262-tanai:6.9778):17.9783):2.1796,(lama1277-dulhi:6.792,lama1277-merde:6.792):20.3438):11.7764):212.0801):23.6239,((((lama1277-baipi:2.1953,lama1277-botun:2.1953):19.9412,lama1277-horow:22.1365):36.2437,(((lama1277-lewog:19.8544,sika1262-hewa:19.8544):19.572,(lama1277-mulan:2.2733,lama1277-pukau:2.2733):37.153):14.5288,pmpacd:53.9552):4.4249):181.7324,lama1277-lewoi:240.1126):34.5035):424.7725,((((((keda1252:8.4543,(lama1277-lewuk:6.5178,lama1277-tanju:6.5178):1.9366):0.379,p-mala1545-abvd:8.8333):63.0507,((lama1277-ileap:18.6792,lama1277-watan:18.6792):1.664,(lama1277-lamat:1.0087,lama1277-waiwa:1.0087):19.3345):51.5409):7.4185,lama1277-lamal:79.3025):5.9759,((lama1277-lewob:25.0134,p-aust1307-abvd:25.0134):24.9483,lama1277-paina:49.9617):35.3168):295.7845,(lama1277-lamak:67.9937,(lama1277-lewom:39.7107,lama1277-lewot:39.7107):28.283):313.0692):318.3257):288.9403,((((lama1277-belan:4.4863,sika1262-maume:4.4863):34.2233,(lama1277-lewok:0.61,p-cent2245-abvd:0.61):38.0996):2.8678,(lama1277-kiwan:15.2953,lama1277-lewoe:15.2953):26.2821):92.111,(lama1277-lerek:16.5274,lama1277-ritae:16.5274):117.1609):854.6406):8.8631,(((((lama1277-imulo:1.5654,lama1277-wuake:1.5654):2.4048,lama1277-lamah:3.9702):71.8679,lama1277-lewop:75.8381):16.6301,lama1277-minga:92.4682):167.5019,(lama1277-kalik:142.671,lama1277-waiba:142.671):117.299):737.222):161.6317,(((((amar1273-kotos:47.0185,tetu1245-suai:47.0185):165.3926,(mamb1306:200.2507,(tetu1246:6.3335,tuku1254:6.3335):193.9171):12.1604):66.5251,(idat1237:65.6112,kema1243:65.6112):213.3251):581.1761,tetu1245-vique:860.1123):136.6011,laka1255:996.7135):162.1103):3499.3276);
(((abui1241-fuime:1295.8585,((((((((abui1241-petle:1.2364,abui1241-takal:1.2364):4.1969,(blag1240-kulij:1.5239,kula1280-lanto:1.5239):3.9094):8.6022,kelo1247-bring:14.0355):7.7412,abui1241-ulaga:21.7767):18.0278,(baka1276:17.651,(kuii1253:4.4034,pura1258:4.4034):13.2476):22.1535):38.4499,((alor1249-sar-a:17.7172,lamm1241-westp:17.7172):17.0961,kaer1234:34.8133):43.4411):116.1091,(((blag1240-bama:1.2564,(dein1238:0.4261,kama1365:0.4261):0.8302):19.3516,(kabo1247:13.2571,wers1238-marit:13.2571):7.3508):99.134,(kafo1240:1.668,nede1245:1.668):118.0739):74.6215):113.9001,((adan1251-lawah:40.7424,((((alor1249-sar-n:8.3128,((blag1240-nule:0.0188,blag1240-tuntu:0.0188):1.2434,kira1248:1.2621):7.0507):2.6594,p-alor1249:10.3947):6.5894,sawi1256:17.5616):18.2241,(blag1240-warsa:16.3594,rett1240:16.3594):19.4264):4.9566):230.2176,((((adan1251-otvai:0.134,kelo1247-hopte:0.134):1.4409,atim1239:1.5749):3.8071,teiw1235:5.382):69.878,(hama1240:36.7708,kola1287:36.7708):38.4892):195.7):37.3036):987.595):52.9835,((((buna1278-bobon:110.2913,buna1278-suai:110.2913):425.3061,(fata1247:8.8879,maka1316:8.8879):526.7096):49.5372,p-east2519:577.4668):74.1609,p-timo1261:652.7542):689.5464):142.5645,(((((((alor1247-baran:26.4941,alor1247-munas:26.4941):51.9038,alor1247-pandai:78.3979):16.8692,alor1247-besar:95.2672):6.6198,(keda1252-leuba:13.1583,lama1277-bama:13.1583):88.7286):427.683,(((((((buna1278-malia:0.6729,(keda1252-leuwa:0.1578,lama1277-dulhi:0.1578):0.5151):11.857,(lama1277-lewog:0.9206,(lama1277-mulan:0.8017,lama1277-pukau:0.8017):0.1189):11.6094):26.9621,(lama1277-adona:0.0232,sika1262-tanai:0.0232):39.4688):3.1084,(lama1277-lewoi:27.0577,(lama1277-merde:25.5956,(pmpacd:17.0505,sika1262-hewa:17.0505):8.5451):1.4621):15.5428):30.7584,((lama1277-baipi:0.383,(lama1277-botun:0.3242,lama1277-horow:0.3242):0.0588):44.6415,((((lama1277-belan:1.585,lama1277-kiwan:1.585):1.4226,(lama1277-lewok:0.3269,p-cent2245-abvd:0.0022):2.6807):6.8263,sika1262-maume:9.8339):10.7182,lama1277-lewoe:20.5521):24.4724):28.3343):57.1007,(lama1277-lerek:3.5243,lama1277-ritae:3.5243):126.9352):31.4559,(((((((((((keda1252:0.4869,lama1277-tanju:0.4869):3.1491,p-mala1545-abvd:0.0174):1.0971,lama1277-lewuk:4.7331):0.4882,lama1277-waiwa:5.2213):3.6345,lama1277-lamat:8.8559):5.3516,(lama1277-watan:0.5405,lama1277-wuake:0.5405):13.667):2.3772,(lama1277-lewob:5.9535,p-aust1307-abvd:1.9425):10.6312):11.611,lama1277-lamal:28.1957):1.0103,lama1277-lamak:29.206):3.8149,lama1277-paina:33.0209):69.5466,(lama1277-lewom:13.2346,lama1277-lewot:13.2346):89.333):59.3479):367.6545):3.8351,((((amar1273-kotos:25.1936,tetu1245-suai:25.1936):106.8039,(((kema1243:17.9974,tetu1246:17.9974):17.2251,(mamb1306:3.3937,tuku1254:3.3937):31.8289):76.8152,laka1255:112.0377):19.9599):42.1824,tetu1245-vique:174.18):321.8716,idat1237:496.0516):37.3535):86.4648,(lama1277-ileap:59.6447,((((lama1277-imulo:0.1267,(lama1277-lewop:0.0585,lama1277-minga:0.0585):0.0682):0.2028,lama1277-lamah:0.3295):0.4309,lama1277-kalik:0.7604):16.7153,lama1277-waiba:17.4757):42.169):560.2251):871.5366);

This does not change for subsequent samples.
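
Checking by eye whether a clade survives in trees of this size is error-prone. The following is a minimal sketch of an automated check, assuming plain Newick input without quoted labels or comments; the helper names `clades` and `is_clade` are made up for this illustration and are not part of BEASTling or BEAST.

```python
import re

def clades(newick):
    """Return the leaf-name set below every parenthesised group in a Newick string."""
    tokens = [t.strip() for t in re.findall(r"[(),;]|[^(),;]+", newick)]
    open_groups = []   # one growing set of leaf names per unclosed "("
    found = []
    previous = None
    for token in filter(None, tokens):
        if token == "(":
            open_groups.append(set())
        elif token == ")":
            finished = open_groups.pop()
            found.append(frozenset(finished))
            if open_groups:
                open_groups[-1] |= finished
        elif token not in (",", ";") and previous != ")":
            # A leaf label, optionally followed by ":branch_length".
            if open_groups:
                open_groups[-1].add(token.split(":")[0])
        # Tokens directly after ")" are internal labels or branch lengths: skipped.
        previous = token
    return found

def is_clade(newick, taxa):
    """True if exactly the given taxa form one parenthesised group in the tree."""
    return frozenset(taxa) in clades(newick)

# Fragment of the starting tree quoted above.
fragment = "((buna1278-bobon:77.8718,fata1247:77.8718):723.378,buna1278-suai:801.2497);"
print(is_clade(fragment, {"buna1278-bobon", "buna1278-suai"}))  # False
print(is_clade(fragment, {"buna1278-bobon", "fata1247"}))       # True
```

Run over each sampled tree, a check like this would show directly whether the three Bunaq dialects ever form a group.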

Anaphory commented 6 years ago

The Beast XML file contains the tag

        <distribution id="timor_tapMRCA" monophyletic="true" spec="beast.math.distributions.MRCAPrior" tree="@Tree.t:beastlingTree">
          <taxonset id="timor_tap" spec="TaxonSet">
            <plate range="buna1278-bobon,buna1278-suai,fata1247,maka1316,p-east2519" var="language">
              <taxon idref="$(language)" />
            </plate>
          </taxonset>
          <Normal id="CalibrationDistribution.timor_tap" mean="3250.0" name="distr" offset="0.0" sigma="127.55102040816327" />
        </distribution>

which is also missing Maliana. The tag that should enforce monophyly of Bunaq is entirely missing.
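
To see at a glance which constraints a generated file actually contains, the MRCAPrior elements can be listed with the standard library. This is only a sketch: it assumes the taxa are given either through a `plate` element's `range` attribute, as in the snippet above, or through plain `taxon` children, and the function name `mrca_constraints` is hypothetical.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<distribution id="timor_tapMRCA" monophyletic="true" spec="beast.math.distributions.MRCAPrior" tree="@Tree.t:beastlingTree">
  <taxonset id="timor_tap" spec="TaxonSet">
    <plate range="buna1278-bobon,buna1278-suai,fata1247,maka1316,p-east2519" var="language">
      <taxon idref="$(language)" />
    </plate>
  </taxonset>
  <Normal id="CalibrationDistribution.timor_tap" mean="3250.0" name="distr" offset="0.0" sigma="127.55102040816327" />
</distribution>
"""

def mrca_constraints(xml_text):
    """Map each MRCAPrior id to (is_monophyletic, set_of_taxa)."""
    root = ET.fromstring(xml_text)
    result = {}
    for dist in root.iter("distribution"):
        if not dist.get("spec", "").endswith("MRCAPrior"):
            continue
        taxa = set()
        for plate in dist.iter("plate"):
            names = plate.get("range", "").split(",")
            taxa.update(name.strip() for name in names if name.strip())
        for taxon in dist.iter("taxon"):
            ref = taxon.get("idref") or taxon.get("id") or ""
            # Skip plate placeholders such as "$(language)".
            if ref and not ref.startswith("$("):
                taxa.add(ref)
        result[dist.get("id")] = (dist.get("monophyletic") == "true", taxa)
    return result

constraints = mrca_constraints(SNIPPET)
for name, (monophyletic, taxa) in constraints.items():
    print(name, monophyletic, sorted(taxa))
print(any("buna1278-malia" in taxa for _, taxa in constraints.values()))  # False
```

Applied to the whole XML file rather than this one element, the last line would confirm whether any constraint mentions buna1278-malia at all.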

Anaphory commented 6 years ago

Oops. I just forgot monophyly=True.
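
A mistake like this is easy to catch before generating any XML. The sketch below reads the configuration with Python's `configparser` and warns when `monophyly_newick` is given without monophyly being switched on. It assumes that the `monophyly` option belongs in the same `[languages]` section as `monophyly_newick`; the function name is hypothetical and this is not a description of what BEASTling itself does.

```python
import configparser

def monophyly_warnings(config_text):
    """Return warnings about monophyly options that will have no effect."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    warnings = []
    if parser.has_section("languages"):
        languages = parser["languages"]
        # Assumption: "monophyly" sits next to "monophyly_newick" in [languages].
        enabled = languages.getboolean("monophyly", fallback=False)
        if "monophyly_newick" in languages and not enabled:
            warnings.append(
                "monophyly_newick is set but monophyly is not True; "
                "the tree will not be enforced"
            )
    return warnings

original = """
[languages]
monophyly_newick = clades.nex
"""
fixed = original + "monophyly = True\n"

print(monophyly_warnings(original))  # one warning
print(monophyly_warnings(fixed))     # []
```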

But:

The file I have now generated gives the following error from BEAST, at a point where BEASTling should already have complained (this is possibly my programming fault, or at least something we need to consider for the #181, #152 and #151 work):

Error 110 parsing the xml input file

validate and intialize error: 333: Don't know how to generate a Random Tree for taxon sets that intersect, but are not inclusive. Taxonset null and timor_tapMRCA

Error detected about here:
  <beast>
      <run id='mcmc' spec='MCMC'>
          <init id='startingTree' spec='beast.evolution.tree.ConstrainedRandomTree'>
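
The condition BEAST rejects here (taxon sets that intersect without one containing the other) can be tested before the XML is written. The following is a sketch of such a check, not existing BEASTling code. The set name `bunaq` is a label invented for the Newick clade, which BEAST reports as "null", and pairing it with `timor_tap` is only one plausible reading of the error.

```python
from itertools import combinations

def conflicting_sets(taxon_sets):
    """Find pairs of taxon sets that overlap without being nested."""
    conflicts = []
    for (name_a, a), (name_b, b) in combinations(taxon_sets.items(), 2):
        a, b = set(a), set(b)
        if a & b and not (a <= b or b <= a):
            conflicts.append((name_a, name_b, sorted(a & b)))
    return conflicts

taxon_sets = {
    # Clade from the monophyly_newick tree (name is hypothetical).
    "bunaq": {"buna1278-bobon", "buna1278-suai", "buna1278-malia"},
    # Calibration set as it appears in the generated XML, without Maliana.
    "timor_tap": {"buna1278-bobon", "buna1278-suai", "fata1247",
                  "maka1316", "p-east2519"},
}

for name_a, name_b, shared in conflicting_sets(taxon_sets):
    print(f"{name_a} and {name_b} overlap in {shared} but neither contains the other")
```

Raising an error for every pair this reports would turn the BEAST failure into an earlier and more specific message.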

Anaphory commented 6 years ago

Bah. The loss of Maliana was due to an uncaught typo. Added a raise to the Glottolog lookup.
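
For illustration, a lookup that fails loudly might look roughly like the sketch below. It is not the change that was made in BEASTling: the function name, the `known` collection and the misspelt example name are all hypothetical, and the close-match suggestions come from the standard library's `difflib`.

```python
import difflib

def resolve_languages(requested, known):
    """Return the requested names, raising if any of them is unknown."""
    unknown = [name for name in requested if name not in known]
    if unknown:
        hints = []
        for name in unknown:
            close = difflib.get_close_matches(name, list(known), n=3)
            hint = f" (did you mean {', '.join(close)}?)" if close else ""
            hints.append(f"{name}{hint}")
        raise ValueError("Unknown language(s): " + "; ".join(hints))
    return list(requested)

known = {"buna1278-bobon", "buna1278-suai", "buna1278-malia", "fata1247"}

try:
    # "buna1278-mailia" is a made-up typo, used only to trigger the error.
    resolve_languages(["buna1278-bobon", "buna1278-mailia"], known)
except ValueError as error:
    print(error)
```

Silently dropping the unmatched name is what allowed the incomplete taxon set to reach BEAST; raising at lookup time reports the offending name directly.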