matsengrp / gctree

GCtree: phylogenetic inference of genotype-collapsed trees
https://matsengrp.github.io/gctree
GNU General Public License v3.0

Poisson context likelihood and better cli flexibility #126

Closed willdumm closed 6 months ago

willdumm commented 6 months ago

This PR adds a Poisson S5F context-based likelihood, described in https://github.com/matsengrp/poisson-subs-models/blob/main/main.tex, as an optional (and default) replacement for "mutability parsimony". This likelihood assumes MLE branch lengths and doesn't require any parameter fitting on the parsimony forest.
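As a rough illustration of what "assumes MLE branch lengths" can mean for a single branch, here is a minimal sketch assuming independent per-site Poisson mutation counts. It is not gctree's implementation: the function name, the flat list of per-site rates, and the omission of S5F target-base substitution probabilities are all simplifications made up for this example.

```python
import math


def poisson_branch_loglik(site_rates, mutated_sites):
    """Log-likelihood of one branch with the branch length set to its MLE.

    Hypothetical helper for illustration only.
    site_rates: per-site mutation rates (e.g. context-dependent mutabilities).
    mutated_sites: indices of sites that mutated exactly once on this branch.
    """
    n = len(mutated_sites)
    if n == 0:
        # With no mutations the MLE branch length is 0 and the likelihood is 1.
        return 0.0
    total_rate = sum(site_rates)
    # For branch length t the log-likelihood is
    #   sum_{i in mutated} log(rate_i * t) - total_rate * t,
    # which is maximized at t = n / total_rate.
    t_hat = n / total_rate
    return sum(math.log(site_rates[i] * t_hat) for i in mutated_sites) - n


# Toy example: three sites, the middle (most mutable) one mutated.
example = poisson_branch_loglik([1.0, 2.0, 1.0], [1])
```

Because the branch length is solved in closed form per branch, nothing has to be fit across the whole forest in this simplified setting.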

The PR also adds two new command line arguments to gctree infer:

The CollapsedForest.filter_trees method was largely rewritten. It should be somewhat faster, and when the verbose flag is set it logs much more detail about how trees are ranked, which helps confirm that each ranking coefficient lines up with its intended weight. It also warns the user if the sign of a ranking coefficient doesn't match the optimization direction for that weight (e.g. coefficients for likelihoods should be negative, since higher likelihoods are better).
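The sign convention can be illustrated with a small standalone sketch. It assumes trees are ranked by minimizing a linear combination of per-tree criteria, which is what the negative-coefficient rule for likelihoods implies. The criterion names, the dictionary-based trees, and both functions are hypothetical and do not reflect the actual filter_trees code.

```python
import warnings

# Hypothetical criteria: "max" means larger values are better, "min" means smaller.
CRITERIA_DIRECTION = {
    "log_likelihood": "max",
    "context_log_likelihood": "max",
    "allele_count": "min",
}


def check_coefficient_signs(coefficients):
    """Warn about, and return, criteria whose coefficient sign fights minimization."""
    mismatched = []
    for name, coeff in coefficients.items():
        direction = CRITERIA_DIRECTION[name]
        # Since the combined score is minimized, "max" criteria need a negative
        # coefficient and "min" criteria need a positive one.
        if (direction == "max" and coeff > 0) or (direction == "min" and coeff < 0):
            warnings.warn(f"coefficient {coeff} for '{name}' has an unexpected sign")
            mismatched.append(name)
    return mismatched


def rank_trees(trees, coefficients):
    """Sort trees by the weighted sum of their criteria, best (smallest) first."""
    check_coefficient_signs(coefficients)
    return sorted(trees, key=lambda t: sum(c * t[k] for k, c in coefficients.items()))


trees = [
    {"name": "a", "log_likelihood": -10.0, "context_log_likelihood": -5.0, "allele_count": 3},
    {"name": "b", "log_likelihood": -8.0, "context_log_likelihood": -6.0, "allele_count": 3},
]
coefficients = {"log_likelihood": -1.0, "context_log_likelihood": -1.0, "allele_count": 1.0}
ranked = rank_trees(trees, coefficients)
```

Flipping the sign of a likelihood coefficient here would silently prefer the least likely tree, which is the kind of mistake the warning is meant to surface.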

More test cases are now included in tests/smalltest.sh, covering more of the possible CLI parameter combinations. There is also a new test in tests/test_likelihoods.py that compares the new context-based likelihood against a couple of simple hand-computed cases.