alexdonath / gappy2

gappy2 is the successor of gappy v1
GNU General Public License v3.0

Using Gappy with other tree builders? #3

Closed CCranney closed 2 years ago

CCranney commented 2 years ago

Hi,

This is more of a question than a bug, though I suppose it could turn into an enhancement. Do you know of anyone who has used gappy in conjunction with other phylogenetic tree builders? The analysis I'm trying to complete would work best if it accounted for gaps as indels (what gappy does well) in addition to other changes, such as substitutions (what other programs do well). Have you seen anybody combine the two? This is an ongoing issue I summarized here in the NextStrain GitHub repository and here in a question on Stack Overflow. I found your program and it addresses the problem I'm grappling with, but only one half of it, if that makes sense.

alexdonath commented 2 years ago

Hi Caleb,

I am sorry, I don't know of any tool that accounts for both substitutions and indels.

The main problem here is the difficulty of implementing substitution models that account for gaps. People have worked on this in the past (see, e.g., here and here), but with a focus on Bayesian tree inference. Maybe the provided links can serve as a starting point for you to dig deeper.

As was mentioned in one of the links you provided, gaps are usually treated as unknown characters, because it is unknown what would be there if something were there. PhyML accounts for gaps by summing the likelihood over all possible states at these positions. IQ-TREE basically ignores them. Have you thought about combining an alignment and the presence/absence matrix produced by gappy in a joint analysis? IQ-TREE allows you to do a partitioned analysis with mixed data. You could treat the presence/absence matrix as binary or morphological data.
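
To make the mixed-data idea concrete, here is a rough sketch of how such a partitioned run could be set up. It only writes an IQ-TREE partition file in NEXUS format and assembles the command line; it does not run anything. The file names `alignment.phy`, `indels.phy` and `mixed.nex` are placeholders, and I am assuming the presence/absence matrix has already been saved in a format IQ-TREE can read. The models (`GTR+G` for the nucleotides, `GTR2` for the binary matrix) and the `-p` option of IQ-TREE 2 are one possible choice, not a recommendation. If the matrix contains only variable columns, an ascertainment bias correction (`+ASC`) may be worth considering.

```python
# Sketch: build an IQ-TREE partition file that combines a nucleotide
# alignment with a binary presence/absence (indel) matrix.
# All file names and model choices below are illustrative assumptions.


def build_partition_nexus(dna_alignment, indel_matrix,
                          dna_model="GTR+G", indel_model="GTR2"):
    """Return the text of a NEXUS partition file with two partitions.

    Each charset points at its own input file; '*' selects all sites.
    """
    lines = [
        "#nexus",
        "begin sets;",
        f"    charset substitutions = {dna_alignment}: *;",
        f"    charset indels = {indel_matrix}: *;",
        f"    charpartition mixed = {dna_model}: substitutions, "
        f"{indel_model}: indels;",
        "end;",
    ]
    return "\n".join(lines) + "\n"


def build_iqtree_command(partition_file, executable="iqtree2"):
    """Return the argument list for a partitioned IQ-TREE 2 run.

    Assumes IQ-TREE 2, where '-p' takes the partition file.
    """
    return [executable, "-p", str(partition_file)]


# Hypothetical inputs: the alignment and the matrix derived from it.
partition_text = build_partition_nexus("alignment.phy", "indels.phy")
command = build_iqtree_command("mixed.nex")

print(partition_text)
print(" ".join(command))
```

Saving the printed partition text as `mixed.nex` next to the two input files and running the printed command would then start the joint analysis. Both input files need to contain the same taxon names for the partitions to be combined sensibly.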

CCranney commented 2 years ago

Thank you for directing me to IQ-TREE's mixed data functionality. I've played with the data a bit and put together a workflow that I think will combine gappy2's indel handling with IQ-TREE's native substitution handling. I wrote up my results at the Stack Overflow link above - take a look and let me know if you see any problems with how I'm using gappy2. Otherwise, I'm going to close this issue. Thanks again!