lmaurits / BEASTling

A linguistics-focussed command line tool for generating BEAST XML files.
BSD 2-Clause "Simplified" License

"The Yule model is not particularly suitable for linguistic analyses, however it is currently the best BEAST has to offer." #233

Open SimonGreenhill opened 5 years ago

SimonGreenhill commented 5 years ago

This is false. BEAST offers many other tree priors, many of which are more appropriate than Yule (e.g. BD contemp).

Anaphory commented 5 years ago

Fair! Do you know anything apart from Taraka Rama's paper that explicitly looks at tree priors suitable for linguistics?

I have BEASTling code for (F?)BD priors and uniform priors, and I used it in our TAP phylogenetics paper. I think it's in here already. It may need testing, and that bit of the documentation definitely needs updating. Maintaining up-to-date documentation is just so hard…

SimonGreenhill commented 5 years ago

Sure. If you assume Yule, you assume (a) complete sampling of all languages and (b) no extinction (= no dead tips), both of which are dubious. There's plenty of work outside linguistics evaluating these assumptions, so there is no need to rely on Taraka's paper. In principle one of the BD models is always more correct: probably BD contemp, if not FBD.
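
To make the two assumptions concrete, here is a sketch in a common birth-death notation. The symbols are not from this thread: birth rate $\lambda$, death rate $\mu$, and probability $\rho$ that an extant language is sampled. Under these conventions, Yule is the corner of the birth-death family where extinction is switched off and sampling is complete:

```latex
% Yule as a constrained birth-death process (notation assumed, not from the thread)
\text{Yule:}\quad \mu = 0,\ \rho = 1
\qquad\text{vs.}\qquad
\text{BD contemp:}\quad \mu \ge 0,\ 0 < \rho \le 1

% Expected number of sampled lineages at time t, starting from one lineage
\mathbb{E}[N_{\text{sampled}}(t)] = \rho\, e^{(\lambda - \mu)t}
\;\xrightarrow{\ \mu = 0,\ \rho = 1\ }\;
e^{\lambda t}
```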

Anaphory commented 5 years ago

Yes, even reasoning from first principles, Yule is a special case of *BD priors with additional assumptions that we should not make. I didn't mean to deny that.

My question was rather this: given that we do want to add tree priors and change the defaults, do we have any evidence for what the new default should be, and which tree priors are worth implementing? What would be the most appropriate prior?

lmaurits commented 5 years ago

I'm very open to changing the default tree prior if there is something currently implemented in BEAST which can be convincingly argued to be better than Yule, and I'm inclined to agree that a BD model fits that bill. My original idea was always that BEASTling's default model decisions would be updated to track what seemed to be the consensus best practice as the field developed, so that people could just bring a dataset (ideally CLDF!) to the party and get something state-of-the-art or close to it. That said, the defaults do skew toward simplicity/minimalism as well, and I'm not averse at all to us maintaining a fuller-flavoured "state of the art template" configuration that people can use by e.g. changing the data filename and adding some calibrations.

Gereon, it does look like your code for BirthDeathGernhard08Model is already in develop. I'm happy to help with testing and documentation if we want to make an effort to get this into the mainstream sooner rather than later. If we can also come up with a good solution to the CLDF problems I recently raised, it might even be worth cutting a new release sometime soonish. Most of the open 1.5.0 issues are "only" test related...

SimonGreenhill commented 5 years ago

You should discuss this with Denise while you're in Jena.

lmaurits commented 5 years ago

Will do. As of yesterday, birth-death tree priors can be used by specifying `tree_prior = birthdeath` in the `[languages]` section, so at least we are no longer Yule-only.
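
For anyone who generates their configuration programmatically, here is a minimal sketch of writing that setting with Python's standard `configparser`. Only the `[languages]` section name and the `tree_prior = birthdeath` line come from this thread. The helper name `languages_section` is hypothetical, and a real BEASTling configuration needs further sections that are not shown here.

```python
import configparser
import io


def languages_section(tree_prior="birthdeath"):
    """Return INI text for a [languages] section selecting a tree prior.

    Hypothetical helper for illustration; not part of BEASTling.
    """
    config = configparser.ConfigParser()
    config["languages"] = {"tree_prior": tree_prior}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


snippet = languages_section()
print(snippet)

# Round-trip: parse the generated text and read the setting back.
parsed = configparser.ConfigParser()
parsed.read_string(snippet)
chosen_prior = parsed["languages"]["tree_prior"]
```

The generated text can be pasted into, or merged with, an existing configuration file. Whether other tree prior names are accepted is not stated in this thread, so check the BEASTling documentation before passing anything other than `birthdeath`.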

Anaphory commented 4 years ago

I have also just implemented a fossilized birth-death prior, which would be a better prior for calibrated tips than coalescent, so I would also be happy to remove coalescent as the fallback and only add it when explicitly asked for. (That would definitely be a breaking API change and require going to version 2.X.X.)