PoonLab / tn

Optimization of genetic clustering methods by predictive modeling
GNU General Public License v3.0

Fast-Branching Cluster Method Bugs #33

Closed ConnorChato closed 4 years ago

ConnorChato commented 4 years ago

Initial results seemed to update only the null AIC, while the model AIC remained static. This explains the dramatic AIC loss and the "too good to be true" smoothness of the plot from the initial test runs (image: output1). Interestingly, an AIC of 300 is still pretty low compared to the AICs of 500-600 that previous null models were obtaining.
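A minimal sketch of the sanity check this bug suggests, assuming AIC loss is defined as model AIC minus null AIC and that both are evaluated once per clustering parameter. The function names and all numbers below are made up for illustration and are not taken from the repository or from any run.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def aic_loss(model_aics, null_aics):
    """Per-parameter difference, model AIC minus null AIC (assumed convention)."""
    return [m - n for m, n in zip(model_aics, null_aics)]


def is_static(values, tol=1e-9):
    """True if a series never changes across the parameter sweep."""
    return max(values) - min(values) <= tol


# Toy log-likelihoods for three parameter settings (illustrative only).
null_loglik = [-50.0, -45.0, -40.0]
model_loglik = [-30.0, -30.0, -30.0]  # never changes: the symptom described above

null_aics = [aic(ll, 1) for ll in null_loglik]
model_aics = [aic(ll, 2) for ll in model_loglik]
losses = aic_loss(model_aics, null_aics)

if is_static(model_aics) and not is_static(null_aics):
    print("warning: model AIC is static while null AIC varies")
```

A check like this would flag the case where only one of the two models is actually being refit across the sweep, which otherwise shows up only as a suspiciously smooth curve.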

ConnorChato commented 4 years ago

Nodes tend to represent points in the tree where the child subtrees are close together in time (or mean time). This leads to extremely limited variation in the time differences displayed by the tree. For Seattle, 97% of all nodes have child branches with mean collection dates within a year of each other (mean difference of 34 days). Given the distribution of collection dates in the Seattle data set, I'd still expect a skew towards 0, but a much more reasonable mean difference of about 3.7 years. I'm exploring strategies that compare the expected distribution of time differences (i.e. the distribution if they didn't matter) with the one shown in the resulting most-likely tree. (image: denDist)
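The comparison described above can be sketched as a permutation test: compute, for every internal node, the absolute difference between the mean collection dates of its two child subtrees, then repeat after shuffling dates across tips to approximate what the differences would look like if time didn't matter. This is a rough sketch under stated assumptions, not the repository's code: the nested-tuple tree, the tip names, the dates, and both function names are hypothetical.

```python
import random
from statistics import mean


def child_mean_diffs(node, dates, out):
    """Return the tip names under `node`; for each internal node, append
    |mean date of left subtree - mean date of right subtree| to `out`."""
    if isinstance(node, str):  # a leaf is a tip name
        return [node]
    left = child_mean_diffs(node[0], dates, out)
    right = child_mean_diffs(node[1], dates, out)
    left_mean = mean(dates[t] for t in left)
    right_mean = mean(dates[t] for t in right)
    out.append(abs(left_mean - right_mean))
    return left + right


def observed_and_null(tree, dates, n_perm=200, seed=1):
    """Mean node-level date difference for the real dates, and its average
    over `n_perm` random reassignments of dates to tips."""
    observed = []
    tips = child_mean_diffs(tree, dates, observed)
    rng = random.Random(seed)
    null_means = []
    for _ in range(n_perm):
        shuffled = [dates[t] for t in tips]
        rng.shuffle(shuffled)
        permuted = []
        child_mean_diffs(tree, dict(zip(tips, shuffled)), permuted)
        null_means.append(mean(permuted))
    return mean(observed), mean(null_means)


# Toy binary tree and collection dates in decimal years (illustrative only).
tree = (("a", "b"), ("c", "d"))
dates = {"a": 2000.0, "b": 2000.1, "c": 2010.0, "d": 2010.2}
obs, null = observed_and_null(tree, dates)
print(f"observed mean difference: {obs:.2f} years, permuted: {null:.2f} years")
```

In this toy case the tree pairs tips with similar dates, so the observed mean difference sits below the permuted one, which is the same direction as the Seattle pattern described above.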

ConnorChato commented 4 years ago

Hmmm - okay, it looks like this may not necessarily just be the model. New sequences don't seem that likely to join recent subtrees compared to older subtrees. If anything, new cases have a slight preference for older subtrees.

I'm going to check for the correctness of time information, check the original tree and pplacer runs, and then re-run this stuff with the Tennessee Diagnostic dates. Once I process them, I'll do the Seattle Diagnostic dates too.

ConnorChato commented 4 years ago

Similar things are going on with the diagnostic Tennessee data set: there is a huge preference for nodes with children that are close together in time, but nothing too dramatic shows up after the tips are added with pplacer.

(images: newIssuePic, newIssuePic2)

I'm going to look over how I'm calculating growth again and then maybe look into alternatives to pplacer. Pumper is less widely used, but I believe it's still seeing regular updates.

ConnorChato commented 4 years ago

Fixed - there was an issue with how the dates ended up syncing up in the tree-building process.
