FePhyFoFum / phyx

phylogenetics tools for linux (and other mostly posix compliant) computers
blackrim.org
GNU General Public License v3.0

pxnj and pxupgma tree inference #142

Open RhettRautsaw opened 3 years ago

RhettRautsaw commented 3 years ago

Running neighbor joining on the same alignment in Geneious and in phyx produces completely different trees. The Geneious tree makes sense, but the tree from pxnj is completely unexpected. The same happens with pxupgma.

josephwb commented 3 years ago

Can you please post input data and expected results?

As the wiki states, the nj and upgma programs are not intended for publishable results. Edge lengths are reported as numbers of substitutions (rather than the standard expected number of substitutions per site) because the author was recreating the original formulation.

Still, I'll take a look and see what the problem is.

jfwalker commented 3 years ago

Before diving in too much, @josephwb, I may be able to answer this. When building the distance matrix, this version treats an indel as a state in its own right, which no modern software that I'm aware of does. When missing data are present, the results will deviate from those of other programs. Also, most newer neighbor-joining tools apply a distance correction such as JC. This version was designed mainly as a teaching tool for analyzing modern data with methods from the past, because UPGMA programs and old-fashioned neighbor joining do not really exist anymore, as far as I'm aware. Sorry for any confusion!
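To make the gap-handling difference concrete, here is a minimal Python sketch. It is not phyx's code: the function names and the set of characters treated as gap/missing are assumptions chosen for illustration. It contrasts a raw difference count where a gap is just another state with a proportion of differences computed only over sites where both sequences have a real base.

```python
# Illustrative sketch only; names and the missing-data alphabet are assumptions,
# not taken from phyx or Geneious.
GAP_OR_MISSING = set("-?Nn")  # assumed nucleotide gap/missing symbols


def raw_differences(a, b):
    """Count mismatching columns, treating every character (gaps included) as a state."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def p_distance(a, b):
    """Proportion of differing sites, skipping columns where either sequence is gap/missing."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x in GAP_OR_MISSING or y in GAP_OR_MISSING:
            continue  # pairwise deletion: this column is not comparable
        compared += 1
        if x.upper() != y.upper():
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


seq1 = "ACGT--ACGT"
seq2 = "ACGTTTACGA"
print(raw_differences(seq1, seq2))  # 3 (two gap columns plus one substitution)
print(p_distance(seq1, seq2))       # 0.125 (one substitution over eight comparable sites)
```

In this toy pair, two of the three counted differences come from the gap columns alone, which suggests how sequences with many gaps or much missing data could end up far from their true relatives under the first scheme.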

RhettRautsaw commented 3 years ago

Sounds like you know what is going on and don't really need my data. UPGMA and NJ have indeed fallen out of standard practice, and I did not have any intention of publishing these trees, but they are still useful for quick examination and even for identifying highly divergent sequences that point to poor-quality sequences in an alignment. Despite this, there are few programs available to actually make UPGMA and NJ trees other than GUIs, so I was excited to see your tool.

I'd love to see these tools updated to match standard NJ and UPGMA calculations (ignoring insertions/missing data) and to perform a distance correction such as JC. Based on how weird my trees looked, I'm not sure these are even usable as teaching tools currently.

PS. I think overall phyx is a great tool and I plan on using it quite a bit.
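For reference, the JC correction mentioned above can be sketched in a few lines of Python. This is the standard Jukes-Cantor (JC69) formula for nucleotides, d = -(3/4) ln(1 - 4p/3), applied to a proportion of differing sites p. The function name is hypothetical, and nothing here reflects how phyx or Geneious implement it.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance (expected substitutions per site).

    p is the observed proportion of differing sites between two nucleotide
    sequences. The correction is undefined for p >= 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# The corrected distance is always at least as large as the observed
# proportion, because it accounts for multiple substitutions at one site.
print(jc69_distance(0.125))
```

Note that the input is a per-site proportion, not a raw count of substitutions, so a count-based matrix would need to be divided by the number of compared sites first.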

josephwb commented 3 years ago

Please go ahead and upload input/expected output.

RhettRautsaw commented 3 years ago

In particular, sequences 66, 67, 44, 45, 94, and 95 are outgroups and should group together. phyx_help.zip

josephwb commented 3 years ago

I've removed these programs from the default build until they are fixed, if they ever are.