Closed: darencard closed this issue 4 years ago
Hi Daren,
PhyloAcc can run with missing data. It treats any character other than acgtrykmsw in the input alignment file as missing data and assumes that a missing position could be any base (i.e. a, c, g, or t). If a species has a long stretch of missing sequence, this will increase the probability of acceleration in that species.
Best, zhirui
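Given the caveat about long missing stretches, it may be worth checking how much of each species' sequence would count as missing before running PhyloAcc. The following is a minimal sketch, not part of PhyloAcc: the function names, the toy alignment, and the 0.5 threshold are all hypothetical, and it assumes case is ignored when matching characters, which should be checked against your PhyloAcc version.

```python
# Characters listed in the reply above as recognised; anything else counts as missing.
RECOGNISED = set("acgtrykmsw")


def missing_fraction(seq):
    """Return the fraction of characters in seq that are not in RECOGNISED."""
    if not seq:
        return 0.0
    # Assumption: uppercase and lowercase are treated alike.
    missing = sum(1 for ch in seq.lower() if ch not in RECOGNISED)
    return missing / len(seq)


def flag_sparse_species(alignment, threshold=0.5):
    """Return names of species whose missing fraction exceeds threshold.

    alignment maps species name -> aligned sequence string.
    The default threshold is an arbitrary illustration, not a recommendation.
    """
    return [name for name, seq in alignment.items()
            if missing_fraction(seq) > threshold]


# Toy alignment for one locus (made-up data).
alignment = {
    "speciesA": "acgtacgtac",
    "speciesB": "acgt------",
    "speciesC": "nnnnnnnnnn",
}

for name, seq in alignment.items():
    print(name, missing_fraction(seq))
print(flag_sparse_species(alignment))
```

Species flagged this way are the ones where, per the reply above, an inferred acceleration may partly reflect the missing data rather than the sequence itself.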
Thanks Zhirui, that's very helpful!
I'm not quite at the stage of running PhyloAcc yet, but in looking at the documentation, I see no mention of missing data. For example, at a given locus, one of the species may lack sequence information because of missing alignments, assembly artifacts, etc. Is there a way to encode this so a user could still run PhyloAcc on a concatenated alignment? Or is it necessary to perform separate PhyloAcc runs when one or more species do not have orthologous sequence data? Any guidance is greatly appreciated!
All the best, Daren Card