TheoBerglin / TutorialGIT

An example repository for showing the power of GIT
0 stars 0 forks source link

Betweenness multiple shortest paths (Weighted) #121

Open TheoBerglin opened 5 years ago

TheoBerglin commented 5 years ago

This "bug" concerns Weighted graphs.

If a graph have multiple paths of the same length, these don't count correctly in the sense that we don't calculate these as multiple possible paths but only as one. This means that only one of the shortest paths contribute to the number of possible paths.

See the test file for matrix A41 for an example.

As we have the same algorithm as bctnet they don't handle this either.

giovannivolpe commented 5 years ago

Ok, let's note this in the comments to the code. Mite, what do you think about this?

mitemijalkov commented 5 years ago

I think that this issue is resolved in the dependency section of the algorithm. Namely, the dependency as expressed in the code is meant to quantify the case when one or more node has an equal path length with the others (as stated in theorem 6 in the original paper). Therefore, I think the code is correct and we should leave it as it is.

TheoBerglin commented 5 years ago

Okey, that's great! I think the issue was more my knowledge about the path length calculations. When I wrote this error down in the test file I thought that the "cost"/"length" of an edge was equal to the weight and not the invert of the weight. I think it should work aswell, I'll create a graph tomorrow to check against.

TheoBerglin commented 5 years ago

@mitemijalkov I tested with another graph and didn't get the correct result, even with the knowledge of inverted weights. I think the algorithm isn't calculating the number of shortest paths correctly in the case o weighted graphs. This is due to

Duw = D(v)+G1(v,w); % path length to be tested if Duw<D(w) % if new u->w shorter than old

Where only if the new path length is shorter it's saved, not equally short. I'll have a look at it more. I tested it with the matrix

A9 = [1 2 0 2 0; 2 1 4 1 0; 0 4 1 0 0; 2 1 0 1 4; 0 0 0 4 1];

Which should yield

exp_res9 = [2/5 5/8 0 5/8 0];

But instead got the expected result

[1/3 1/2 0 1/2 0]

For node 2 and 4 the reason is that we only calculate the number shortest path as one and that this one goes through node 2 and 4 for path 3->4 and 3->5 for node 2 and 2->5 and 3->5 for node 4.

As for the incorrect answer of node 1 I don't have a clear idea why it doesn't work.

Is this measure important and should we seek out a solution to this problem or leave it? I think we'll rarely have a weighted matrix for which 2 paths are equally short

TheoBerglin commented 5 years ago

The changed test file is available at the branch "121-betweenness-bug"

mitemijalkov commented 5 years ago

Can I ask how did you get the expected result? The reason I ask is I did it manually for this matrix and calculated the number of shortest paths each node lies on. The results for nodes 2 and 4 were correct (6 paths each) but the algorithm is returning value of 2 for node 1 (and you also have a nonzero value for it), while I could not find any shortest paths that pass through 1 by inspection? Wondering what did I miss? The 1/2 fraction is obtained by calculating 6 paths for nodes 2 and 4 and dividing it by the number of all possible paths (excluding the node itself), which in this case is 4*3=12 (so this result from the algorithm is correct. But I am still not sure about how node 1 is nonzero.

mitemijalkov commented 5 years ago

The normalization is not crucial (as one would be generally more interested in the relative betweeness between nodes rather that the absolute one) but it seems that there should be a division by 2 in the case of undirected networks.

TheoBerglin commented 5 years ago

In the case of undirected graphs I only calculate half of your connections as the other way around will yield the same result, ie I only calculate 2->3 and not 2->3 and 3->2.

For example for shortest path between node 3 and 4 the path 3->2->1->4 is equally short as 3->2->4 due to the weights of the edges. In this case, we have 2 shortest paths from node 3 to node 4. One of the two shortest paths pass through node 1. Both of the shortest paths pass through node 2 though. I've uploaded an image of my calculations to the branch 121-betweenness-bug in the util folder of graph.

If it's not important then I think we should leave this until we've done the randomization.

mitemijalkov commented 5 years ago

Hi, I agree lets leave it for now and finish the randomization first. Then, we have a skype meeting where we agree on what to do with it and we mention everything that we need to mention in the code/manual so that people know what they are calculating.

TheoBerglin commented 5 years ago

Hi,

That sounds great!

giovannivolpe commented 5 years ago

OK, please keep this in mind (I think this is what the "bug" tags implies).