Closed Boarders closed 4 years ago
This issue appears to be that the output graph from the clustering function(s) is malformed.
This is the adjacency matrix for the graph that was produced from the clustering method, after the build(1, tree, cluster:(upgma, group:5))
command but before the build(1, network)
command:
node parent(s) children
0 {6} {1,2}
1 {0} {}
2 {0} {}
3 {4} {}
4 {25} {3,5}
5 {4} {}
6 {8} {0,7}
7 {6} {}
8 {10} {6,9}
9 {8} {}
10 {12} {8,11}
11 {10} {}
12 {14} {10,13}
13 {12} {}
14 {16} {12,15}
15 {14} {}
16 {20} {14,18}
17 {18} {}
18 {16} {17,19}
19 {18} {}
20 {25} {16,21}
21 {20} {}
22 {29} {}
23 {29} {}
24 {} {25,28}
25 {24} {26,29}
26 {25} {4,20}
27 {29} {}
28 {24} {}
29 {25} {27,31}
30 {31} {22,23}
31 {29} {30,32}
32 {31} {}
Additionally, here is a nice (but deceptive) rendering of the tree:
24
|
------------------------------
/ \
25 28
|
----------------------------
/ \
26 29
| |
-------------------- -------
/ \ / \
4 20 27 31
| | |
-- ----------------- -----
/ \ / \ / \
3 5 16 21 30 32
| |
--------------- ---
/ \ / \
14 18 22 23
| |
----------- ---
/ \ / \
12 15 17 19
|
---------
/ \
10 13
|
-------
/ \
8 11
|
-----
/ \
6 9
|
----
/ \
0 7
|
--
/ \
1 2
As you can see from the adjacency matrix, nodes 22
and 23
are leaf nodes which both list that node 29
is their parent. However node 29
lists nodes 27
and 31
as it's children. The referential integrity of the graph object is broken at this point, and the later attempt to perform re-rooting logic in the subsequent build command fails because it assumes referential integrity as an invariant that is maintained.
This issue seems to be that nodes 22
and 23
did not have their parents set correctly somewhere in the clustering code.
I created this branch for diagnosing and correcting the defect: clustering-defect
Ah interesting thanks for figuring it out. I think I have probably made an index error somewhere along the way. I can have a look at fixing this when I get a moment. I think this must then be different from the pcg-test-flu-net
case then as that was using an ordinary wagner build first.
Since nodes 22
and 23
list node 29
as their parent, but it appears that node 30
is their correct parent, it might be an off by one error.
It probably is a different issue than what is happening in #159.
That sounds correct to me. I would guess the error is located in the substitution function somewhere as that does a bunch of reasonably complicated fiddling with indices. I'll add this defect to the test suite for that code when I look into it.
This issue is (hopefully) fixed in this commit: https://github.com/amnh/PCG/commit/e3bea6797e95af26c54ba2c1ba316de5505c8cf2
When running the following script on the chel data set:
I get the error:
Running either build command without the other build command seems to work as expected.