Open TheoBerglin opened 5 years ago
Ok, let's note this in the comments to the code. Mite, what do you think about this?
I think that this issue is resolved in the dependency section of the algorithm. Namely, the dependency as expressed in the code is meant to quantify the case when one or more node has an equal path length with the others (as stated in theorem 6 in the original paper). Therefore, I think the code is correct and we should leave it as it is.
Okey, that's great! I think the issue was more my knowledge about the path length calculations. When I wrote this error down in the test file I thought that the "cost"/"length" of an edge was equal to the weight and not the invert of the weight. I think it should work aswell, I'll create a graph tomorrow to check against.
@mitemijalkov I tested with another graph and didn't get the correct result, even with the knowledge of inverted weights. I think the algorithm isn't calculating the number of shortest paths correctly in the case o weighted graphs. This is due to
Duw = D(v)+G1(v,w); % path length to be tested
if Duw<D(w) % if new u->w shorter than old
Where only if the new path length is shorter it's saved, not equally short. I'll have a look at it more. I tested it with the matrix
A9 = [1 2 0 2 0; 2 1 4 1 0; 0 4 1 0 0; 2 1 0 1 4; 0 0 0 4 1];
Which should yield
exp_res9 = [2/5 5/8 0 5/8 0];
But instead got the expected result
[1/3 1/2 0 1/2 0]
For node 2 and 4 the reason is that we only calculate the number shortest path as one and that this one goes through node 2 and 4 for path 3->4 and 3->5 for node 2 and 2->5 and 3->5 for node 4.
As for the incorrect answer of node 1 I don't have a clear idea why it doesn't work.
Is this measure important and should we seek out a solution to this problem or leave it? I think we'll rarely have a weighted matrix for which 2 paths are equally short
The changed test file is available at the branch "121-betweenness-bug"
Can I ask how did you get the expected result? The reason I ask is I did it manually for this matrix and calculated the number of shortest paths each node lies on. The results for nodes 2 and 4 were correct (6 paths each) but the algorithm is returning value of 2 for node 1 (and you also have a nonzero value for it), while I could not find any shortest paths that pass through 1 by inspection? Wondering what did I miss? The 1/2 fraction is obtained by calculating 6 paths for nodes 2 and 4 and dividing it by the number of all possible paths (excluding the node itself), which in this case is 4*3=12 (so this result from the algorithm is correct. But I am still not sure about how node 1 is nonzero.
The normalization is not crucial (as one would be generally more interested in the relative betweeness between nodes rather that the absolute one) but it seems that there should be a division by 2 in the case of undirected networks.
In the case of undirected graphs I only calculate half of your connections as the other way around will yield the same result, ie I only calculate 2->3 and not 2->3 and 3->2.
For example for shortest path between node 3 and 4 the path 3->2->1->4 is equally short as 3->2->4 due to the weights of the edges. In this case, we have 2 shortest paths from node 3 to node 4. One of the two shortest paths pass through node 1. Both of the shortest paths pass through node 2 though. I've uploaded an image of my calculations to the branch 121-betweenness-bug in the util folder of graph.
If it's not important then I think we should leave this until we've done the randomization.
Hi, I agree lets leave it for now and finish the randomization first. Then, we have a skype meeting where we agree on what to do with it and we mention everything that we need to mention in the code/manual so that people know what they are calculating.
Hi,
That sounds great!
OK, please keep this in mind (I think this is what the "bug" tags implies).
This "bug" concerns Weighted graphs.
If a graph have multiple paths of the same length, these don't count correctly in the sense that we don't calculate these as multiple possible paths but only as one. This means that only one of the shortest paths contribute to the number of possible paths.
See the test file for matrix A41 for an example.
As we have the same algorithm as bctnet they don't handle this either.