iit-cs579 / main

CS579: Online Social Network Analysis at the Illinois Institute of Technology
147 stars 204 forks source link

[a1] norm_cut_score() ambiguous tie breaking rule #487

Closed ppttzhu closed 5 years ago

ppttzhu commented 5 years ago

sorted graph.pdf

In norm_cut_score test case max_depth=1, I cannot get 1.007. I think the difference is caused by tie breaking rule. HW instruction is ambiguous: break by edge name (e.g., (('A', 'B'), 1.0) comes before (('B', 'C'), 1.0)). It should be more clear such as: alphabetical order by first_node then second_node, given an edge (first_node, second_node).

If I use python3 default sort() function, the sequence I get is as attached.

aronwc commented 5 years ago

The instructions in partition_girvan_newman:

If there are ties in edge betweenness, break by edge name (e.g., (('A', 'B'), 1.0) comes before (('B', 'C'), 1.0)).

Yes, sorting by the edge names is sufficient. E.g.:

>>> sorted([('A', 'C'), ('B', 'A'), ('A', 'B')])
[('A', 'B'), ('A', 'C'), ('B', 'A')]

Note that when you sort tuples, it first sorts by the first element of each tuple, then the second, etc.

ppttzhu commented 5 years ago

Great, that's helpful.

The instructions in partition_girvan_newman:

If there are ties in edge betweenness, break by edge name (e.g., (('A', 'B'), 1.0) comes before (('B', 'C'), 1.0)).

Yes, sorting by the edge names is sufficient. E.g.:

>>> sorted([('A', 'C'), ('B', 'A'), ('A', 'B')])
[('A', 'B'), ('A', 'C'), ('B', 'A')]

Note that when you sort tuples, it first sorts by the first element of each tuple, then the second, etc.