vlanore / diffsel

A C++ program to detect convergent evolution using Bayesian phylogenetic codon models.

diffsel error: non matching number of taxa #2

Closed NatJWalker-Hale closed 5 years ago

NatJWalker-Hale commented 5 years ago

I'm having trouble getting diffsel to parse a tree of 198 sequences. It was produced by IQ-TREE and I subsequently processed the branch labeling in R using ape. When I run diffsel, I receive the error:

error : non matching number of taxa : 196   198
some taxa in the dataset are not present in the tree

I've attached the tree, with two random branches set to 1 and the rest to 0. Could you please help me figure out why this Newick tree and alignment aren't being parsed successfully?

Thanks in advance,

Nathanael

test.phy.txt test.tre.txt

vlanore commented 5 years ago

Hello,

Sorry for the delay. I just took some time to look at your problem, and it seems that two taxon names are misspelled in your alignment: Cactaceae_Hylocereus_lemairei_polyrhizus_15461 and Chenopodiaceae_Chenopodium_giganteum_Chenopodium_amaranticolor_14731. In the alignment, both names contain an = where the tree has a _.
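
For anyone hitting the same error, a mismatch like this can be located by comparing the two sets of taxon names. The following is only a minimal Python sketch and is not part of diffsel: the function names are hypothetical, it assumes a sequential PHYLIP file whose lines start with the taxon name followed by whitespace, and it assumes unquoted leaf names in the Newick string.

```python
import re


def phylip_taxa(alignment_text):
    """Return the taxon names of a sequential PHYLIP alignment.

    Assumption: the first non-empty line is "<ntaxa> <nsites>" and each of
    the next ntaxa non-empty lines starts with the taxon name.
    """
    lines = [ln for ln in alignment_text.splitlines() if ln.strip()]
    ntaxa = int(lines[0].split()[0])
    return {ln.split()[0] for ln in lines[1:1 + ntaxa]}


def newick_taxa(newick):
    """Return the leaf names of a Newick string.

    Assumption: leaf names are unquoted, so a leaf is any label that
    directly follows "(" or ",". Internal labels follow ")" and are skipped.
    """
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def mismatched_taxa(alignment_text, newick):
    """Return (names only in the alignment, names only in the tree)."""
    aln = phylip_taxa(alignment_text)
    tree = newick_taxa(newick)
    return sorted(aln - tree), sorted(tree - aln)
```

Calling mismatched_taxa on the contents of the alignment and tree files should list the names that appear on only one side, which is usually enough to spot a stray character.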

To fix this, just replace all = with _ in your alignment. With this fix, it works on my side.
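
If you would rather script the replacement, here is a minimal Python sketch. It is an illustration, not a diffsel utility: the function name is hypothetical, and it assumes that each taxon name is the first whitespace-delimited token on its line. It only touches that first token, so the rest of each line keeps its layout.

```python
import re


def fix_taxon_names(alignment_text):
    """Replace "=" with "_" in the first token of every line.

    Assumption: the taxon name is the first whitespace-delimited token of
    its line. The header line and sequence-only lines contain no "=", so
    they pass through unchanged.
    """
    return re.sub(
        r"^\S+",
        lambda m: m.group().replace("=", "_"),
        alignment_text,
        flags=re.MULTILINE,
    )


# Example usage (the output file name is hypothetical):
# from pathlib import Path
# text = Path("test.phy.txt").read_text()
# Path("test.fixed.phy.txt").write_text(fix_taxon_names(text))
```

Writing the result to a new file rather than overwriting the original keeps the unmodified alignment around in case the names need to be checked again.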

NatJWalker-Hale commented 5 years ago

Thanks, that fixed it!

vlanore commented 5 years ago

Glad I could help! Marking this issue as closed.