lmaurits / BEASTling

A linguistics-focussed command line tool for generating BEAST XML files.
BSD 2-Clause "Simplified" License

Better logging #53

Closed lmaurits closed 7 years ago

lmaurits commented 8 years ago

BEASTling should automatically log more details than it currently does, and logging policy should be driven by best practices for MCMC phylogenetics. For example, for every clade with a calibration prior, we should also log the age of that clade, so that when the analysis is run sampling from the prior, the interaction between the tree prior and the calibrations can be inspected.

We should log tree heights in all analyses where branch lengths are sampled.

If possible, random local clock analyses should log rate change locations; if that is not currently possible, we should write the Java code to make it so.
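As a sketch of the kind of BEAST 2 XML this implies (all ids, filenames, and distribution parameters here are hypothetical, not BEASTling's actual output), a calibrated clade defined via an `MRCAPrior` can be referenced from the trace logger, which makes BEAST log that clade's MRCA age; a `TreeHeightLogger` covers the root height:

```xml
<!-- Hypothetical calibration on a clade; the taxon set is elided -->
<prior id="FamilyX.prior" spec="beast.math.distributions.MRCAPrior"
       tree="@tree" monophyletic="true">
  <taxonset id="FamilyX" spec="TaxonSet">
    <!-- ... taxa in the calibrated clade ... -->
  </taxonset>
  <distr spec="beast.math.distributions.LogNormalDistributionModel"
         M="8.0" S="0.5"/>
</prior>

<logger id="tracelog" fileName="analysis.log" logEvery="10000">
  <!-- Root height of the sampled tree -->
  <log id="treeHeight" spec="beast.evolution.tree.TreeHeightLogger" tree="@tree"/>
  <!-- Logging the MRCAPrior itself records the clade's MRCA age -->
  <log idref="FamilyX.prior"/>
</logger>
```

Running the same XML with sampling from the prior then lets the logged clade ages be compared against the calibration densities, which is the interaction the issue asks to make inspectable.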

lmaurits commented 8 years ago

All of the scalar things mentioned above are now logged. I'm not sure how to proceed with logging rate information for non-strict clocks. It's straightforward enough if there is only one such clock, but if there are multiple I don't yet know whether there is a way to log them all in a single tree logfile. We could of course have one logfile per clock, but there would be quite some redundancy between the files...

lmaurits commented 7 years ago

I think logging is now handled about as well as it can be without writing additional BEAST classes. It's not possible to log multiple clock rates in a single tree file, so the current setup produces one tree file per non-strict clock. If there are multiple non-strict clocks, the clock model's name is included in each filename to keep them unambiguous; if there is only one, we just use the standard basename.nex filename. Geography gets its own tree with locations logged in it (and branch rates, if the geographic clock is non-strict). We could be a little more disk-efficient by not writing a separate geography tree for the location logging in cases where the geography model shares its non-strict clock with the data models, but this is likely to be a rare situation, so it's no great loss.
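For reference, the per-clock setup described above corresponds to one `TreeWithMetaDataLogger` per branch-rate model in BEAST 2, each writing to its own file. A hedged sketch with hypothetical ids and filenames (not the exact XML BEASTling generates):

```xml
<!-- One tree log per non-strict clock; the clock model name
     disambiguates the filenames when there are several -->
<logger id="treelog.clockA" fileName="basename_clockA.nex"
        logEvery="10000" mode="tree">
  <log id="treeWithRates.clockA"
       spec="beast.evolution.tree.TreeWithMetaDataLogger"
       tree="@tree" branchratemodel="@clockA"/>
</logger>

<logger id="treelog.clockB" fileName="basename_clockB.nex"
        logEvery="10000" mode="tree">
  <log id="treeWithRates.clockB"
       spec="beast.evolution.tree.TreeWithMetaDataLogger"
       tree="@tree" branchratemodel="@clockB"/>
</logger>
```

Each file contains the same sampled topologies, annotated with a different clock's branch rates, which is the redundancy noted earlier in the thread.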