achtman-lab / GrapeTree

GrapeTree is a fully interactive, tree visualization program, which supports facile manipulations of both tree layout and metadata. Click the first link to launch: https://achtman-lab.github.io/GrapeTree/MSTree_holder.html
https://genome.cshlp.org/content/28/9/1395
GNU General Public License v3.0

Strange behaviour with MSTreeV2 #82

Open LordRust opened 5 years ago

LordRust commented 5 years ago

Sometimes the MSTreeV2 algorithm does not give me the expected behaviour. The attached example consists of re-sequenced runs of a few isolates. The pairwise distances are quite small:

11
iso5-run1  0   5   6   7   5   10  9   5  1   7   16
iso2-run2  5   0   4   5   3   7   7   2  5   5   14
iso3-run1  6   4   0   3   3   5   4   1  5   3   10
iso4-run2  7   5   3   0   3   7   6   2  6   1   11
iso2-run1  5   3   3   3   0   7   6   2  4   2   12
iso1-run2  10  7   5   7   7   0   1   4  9   7   16
iso1-run1  9   7   4   6   6   1   0   3  8   6   15
iso3-run2  5   2   1   2   2   4   3   0  4   2   9
iso5-run2  1   5   5   6   4   9   8   4  0   6   15
iso4-run1  7   5   3   1   2   7   6   2  6   0   11
iso6       16  14  10  11  12  16  15  9  15  11  0

When using MSTree, the resulting graph looks as expected (MSTree.svg):

$ grapetree -m MSTree -p ST5210_problem.chew
(iso6:9,(iso5-run1:1,iso5-run2:0):4,(iso1-run2:1,iso1-run1:0):3,iso2-run2:2,(iso4-run1:1,iso4-run2:0):2,iso2-run1:2,iso3-run1:1,iso3-run2:0);
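As a sanity check on "as expected", here is a minimal sketch that builds a classical minimum spanning tree directly from the distance matrix above with Prim's algorithm. It is not GrapeTree's implementation; `NAMES`, `DIST`, and `prim_mst` are names made up for this illustration, and the matrix is assumed to be the symmetric pairwise distance table shown earlier.

```python
# Distance matrix copied from the report (rows and columns share the same order).
NAMES = ["iso5-run1", "iso2-run2", "iso3-run1", "iso4-run2", "iso2-run1",
         "iso1-run2", "iso1-run1", "iso3-run2", "iso5-run2", "iso4-run1", "iso6"]
DIST = [
    [0, 5, 6, 7, 5, 10, 9, 5, 1, 7, 16],
    [5, 0, 4, 5, 3, 7, 7, 2, 5, 5, 14],
    [6, 4, 0, 3, 3, 5, 4, 1, 5, 3, 10],
    [7, 5, 3, 0, 3, 7, 6, 2, 6, 1, 11],
    [5, 3, 3, 3, 0, 7, 6, 2, 4, 2, 12],
    [10, 7, 5, 7, 7, 0, 1, 4, 9, 7, 16],
    [9, 7, 4, 6, 6, 1, 0, 3, 8, 6, 15],
    [5, 2, 1, 2, 2, 4, 3, 0, 4, 2, 9],
    [1, 5, 5, 6, 4, 9, 8, 4, 0, 6, 15],
    [7, 5, 3, 1, 2, 7, 6, 2, 6, 0, 11],
    [16, 14, 10, 11, 12, 16, 15, 9, 15, 11, 0],
]


def prim_mst(dist):
    """Return MST edges as (parent, child, weight), grown from node 0 (hypothetical helper)."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known edge into the tree for each node
    parent = [-1] * n
    best[0] = 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Relax the connecting cost of every node still outside the tree.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


edges = prim_mst(DIST)
total = sum(w for _, _, w in edges)
for p, c, w in edges:
    print(f"{NAMES[p]} -- {NAMES[c]}: {w}")
print("total MST length:", total)  # 26
```

The total of 26 equals the sum of the branch lengths in the MSTree Newick string above, and each re-sequenced pair at distance 1 ends up joined by a direct edge. Ties between equal-weight edges mean the exact topology may differ from GrapeTree's output even though the total length agrees.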

If I use the MSTreeV2 algorithm, the isolates stop grouping together and I get a star topology with (under the circumstances) very long branches between the re-sequenced isolates (MSTreeV2.svg):

$ grapetree -m MSTreeV2 -p ST5210_problem.chew
(iso6:15,iso1-run2:9,iso1-run1:8,iso4-run1:6,iso4-run2:6,iso2-run2:5,iso3-run1:5,iso2-run1:4,iso3-run2:4,iso5-run1:1,iso5-run2:0);
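To put numbers on the difference, here is a small sketch that sums the branch lengths of the two Newick strings from this report and compares the MSTreeV2 leaf lengths with the iso5-run2 row of the distance matrix. The regex parsing is a shortcut that only suits simple strings like these, not a general Newick parser, and `total_length` and `leaf_lengths` are helper names invented for this example.

```python
import re

# Newick outputs copied verbatim from the two grapetree runs in this report.
MSTREE = ("(iso6:9,(iso5-run1:1,iso5-run2:0):4,(iso1-run2:1,iso1-run1:0):3,"
          "iso2-run2:2,(iso4-run1:1,iso4-run2:0):2,iso2-run1:2,iso3-run1:1,iso3-run2:0);")
MSTREEV2 = ("(iso6:15,iso1-run2:9,iso1-run1:8,iso4-run1:6,iso4-run2:6,iso2-run2:5,"
            "iso3-run1:5,iso2-run1:4,iso3-run2:4,iso5-run1:1,iso5-run2:0);")

# Column order of the distance matrix and the row belonging to iso5-run2.
ORDER = ["iso5-run1", "iso2-run2", "iso3-run1", "iso4-run2", "iso2-run1",
         "iso1-run2", "iso1-run1", "iso3-run2", "iso5-run2", "iso4-run1", "iso6"]
ISO5_RUN2_ROW = [1, 5, 5, 6, 4, 9, 8, 4, 0, 6, 15]


def total_length(newick):
    """Sum every branch length (the number after each ':') in a Newick string."""
    return sum(float(x) for x in re.findall(r":([0-9.]+)", newick))


def leaf_lengths(newick):
    """Map each labelled node to the branch length that follows its name."""
    return {name: float(length)
            for name, length in re.findall(r"([^(),:;]+):([0-9.]+)", newick)}


print("MSTree total:  ", total_length(MSTREE))    # 26.0
print("MSTreeV2 total:", total_length(MSTREEV2))  # 63.0

# Every MSTreeV2 branch equals the direct distance from iso5-run2 to that isolate.
is_star = leaf_lengths(MSTREEV2) == dict(zip(ORDER, ISO5_RUN2_ROW))
print("MSTreeV2 is a star centred on iso5-run2:", is_star)  # True
```

So the MSTreeV2 tree is more than twice as long in total (63 versus 26), and its branch lengths are exactly the matrix distances from iso5-run2 to every other isolate, which is consistent with the star topology described above.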

I am not sure whether this is the expected behaviour, but it would certainly mislead me in a cluster analysis. Profile raw data and .svg files are attached: ST5210_bugreport.zip