Open count0 opened 4 years ago
Thanks Tiago for letting us know. I also checked the original data at "http://www.boardsandgender.com/data.php". We changed the node indices because the data we provide here only includes the largest component. In the original dataset, the nodes with the indices (73, 146, 5134) form the same triangle!
Indeed, it's a problem with the original dataset! It seems that the two kinds of nodes (director and board) can have the same index, which is error-prone.
Thanks for fixing.
(Note that fixing the index changes the largest component of some of these networks quite a bit, since they are very sparse.)
Interesting! I didn't know about that. Can you give me a reference? How can changing the indices change the largest component? Is it an algorithmic issue?
The repeated indexes cause the number of nodes to be smaller, since it merges different nodes together (thus destroying the bipartite property). Thus fixing this problem increases the number of nodes, while keeping the number of edges constant.
For example, for the data Norwegian_Board_of_Directors_net2mode_2009-10-01, we have N=1332 nodes before fixing the indexes, and N=1729 afterwards, while E=1465 remains the same.
As a result of the sparsification, the number of components jumps from 1 to 278!
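To illustrate the mechanism, here is a minimal sketch on a toy edge list (not the actual data). It assumes each edge is stored as a (director, board) pair, which is an assumption about the file layout; `separate_modes` and `count_nodes_and_components` are hypothetical helper names. Tagging each endpoint with its mode keeps the number of edges fixed while the number of nodes and components grows.

```python
from collections import defaultdict


def count_nodes_and_components(edges):
    """Return (number of nodes, number of connected components) of an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        # Depth-first traversal of the component containing `start`
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return len(adj), components


def separate_modes(edges):
    """Tag endpoints with their mode so director i and board i become distinct nodes.

    Assumes the first column is the director and the second is the board.
    """
    return [(("director", d), ("board", b)) for d, b in edges]


# Toy edge list in which indices 1, 2 and 3 are reused across both modes
raw_edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
fixed_edges = separate_modes(raw_edges)

print(len(raw_edges), count_nodes_and_components(raw_edges))
print(len(fixed_edges), count_nodes_and_components(fixed_edges))
```

In the toy example the shared indices glue a director and a board into a single node, which creates a spurious triangle; once the modes are separated, the same four edges fall apart into four disconnected pairs.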
72% of the graphs labeled as bipartite are not bipartite according to the given edge list.
Yes, as Tiago pointed out before, all “Norwegian_Board_of_Directors_net2mode…” networks (111 networks, network ids = 254–364) and one network called “Aishihik_Lake_host-parasite_web_Aishihik_Lake_host-parasite_web” (network id = 0) have this indexing issue. So in total, for 112 of the 572 networks, the source had this issue. We fixed that in our new publication on optimal link prediction. I will attach the corrected version of the networks soon.
Thank you! I am waiting for the corrected version!
“Norwegian_Board_of_Directors_net2mode…” after the "graphProperties" column are the projected networks. What I pointed out is that, going by the "graphProperties" column, where the "Bipartite" property appears, 113 of 157 cases are not bipartite after the indexing. I didn't check the projected graphs, but if this is true then 112+113 = 225 networks out of 572 have a problem.
No, the projected graphs were not projected by us; they come from the original source. The issue is related to the indexing in some of the bipartite graphs that I mentioned. The "Norwegian_Board_of_Directors_net2mode" family of networks and one network called “Aishihik_Lake_host-parasite_web_Aishihik_Lake_host-parasite_web” had this issue, which is now fixed, and I will attach the corrected version very soon. Thanks!
Indeed. There are also the projected and the bipartite versions in the dataframe. Thank you!
Sorry about the delay. I updated the dataframe.
I noticed that most of the 2-mode "Norwegian Board of Directors" networks, which are supposed to be bipartite, actually contain odd-length cycles. For example, in data
The following triangle exists:
Other data from this series do not contain triangles, but longer odd-length cycles do exist. Only a minority of them are actually bipartite.
The primary data, downloaded from the original website, seems to have the same problem... It seems the node indexes in the two modes (director and board) can repeat.
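For anyone who wants to reproduce the check, here is a minimal sketch of a bipartiteness test by breadth-first 2-coloring, using only the standard library. A graph is bipartite exactly when it has no odd-length cycle, so a coloring conflict means an odd cycle exists. The function name `is_bipartite` and the toy edge lists are illustrative assumptions, not taken from the dataset files.

```python
from collections import defaultdict, deque


def is_bipartite(edges):
    """Return True if the undirected graph given as an edge list has no odd-length cycle."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in color:
                    # Neighbors must take the opposite color
                    color[nb] = 1 - color[node]
                    queue.append(nb)
                elif color[nb] == color[node]:
                    # Two adjacent nodes share a color: odd cycle found
                    return False
    return True


# A triangle on three node indices is an odd cycle
triangle = [(73, 146), (146, 5134), (5134, 73)]
# A 4-cycle alternating between two modes is bipartite
square = [("d1", "b1"), ("b1", "d2"), ("d2", "b2"), ("b2", "d1")]

print(is_bipartite(triangle))
print(is_bipartite(square))
```

Running this on each edge list as distributed, and again after giving the two modes disjoint indices, would show which networks are affected by the repeated indexes.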