Closed: OzanSahin92 closed this issue 9 months ago.
Could not reproduce this: `Network.nsi_betweenness()` works fine for me on networks of up to 10000 nodes (I'm limited in network size by my old machine's RAM here).
Assuming this has been resolved by @ntfrgl's recent fixes in the Cython/C modules, and hence closing the issue.
@fkuehlein, I don't think that you have provided sufficient reason for closing this issue: the work you mentioned on the Cython/C method signatures wouldn't concern an indexing error deep inside a Cython function, and you could in principle, say, spin up a moderately sized Amazon Elastic Compute Cloud instance to obtain enough RAM for reproducing the issue.
Nonetheless, I agree with closing the issue for now. My commit above removes some obvious slowdowns in the method, after which I successfully tested it on a random network with slightly higher node and edge counts than the bug report:
```python
from pyunicorn import Network

net = Network.Model("ErdosRenyi", n_nodes=26000, link_probability=.2)
btw = net.nsi_betweenness(parallelize=True)
```
The explanation suggested by the issue author is unfortunately incorrect: as is now explicitly asserted, the array `flat_predecessors` is of size `E = 2 * n_links`, which is the sum of all degrees (both for directed and undirected networks) rather than the number of edges. In contrast, the error message above appears to have `n_links = 61761383` and `E = 123522767 = 2 * n_links + 1`, which might indicate an inconsistent data structure produced by the user's script. Otherwise, it is of course still conceivable that there remains an uncovered edge case in the implementation, but we would probably need the exact adjacency matrix to reproduce it.
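The degree-sum identity relied on above (the handshake lemma) can be checked independently of pyunicorn; this minimal NumPy sketch, using an assumed random undirected adjacency matrix, confirms that the sum of all degrees is exactly `2 * n_links`, never `2 * n_links + 1`:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
# Symmetric 0/1 adjacency matrix without self-loops (undirected graph)
upper = np.triu(rng.random((n, n)) < 0.2, k=1).astype(int)
adj = upper + upper.T

n_links = int(adj.sum()) // 2   # each undirected edge is counted twice
degrees = adj.sum(axis=0)
E = int(degrees.sum())          # size required for a degree-indexed flat array

# Handshake lemma: the degree sum equals twice the number of edges,
# so an odd value like 2 * n_links + 1 signals an inconsistent structure.
assert E == 2 * n_links
```

An odd degree sum, as in the reported error message, is therefore impossible for a well-formed undirected adjacency matrix.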
Right, I was a bit too quick to conclude there.
All the more thanks for your speedup work and for providing good reference and documentation!
I am currently trying to calculate the `nsi_betweenness` of my directed graph. Printing my network object gives me:

```
Network: undirected, 25600 nodes, 61761383 links, link density 0.188.
```

I tried to find the error with gdb, and it seems to happen because of index errors in the Cython function `_nsi_betweenness`. I think the problem lies in the indexing via the degree: some arrays in the function are of size `[0, E]`, with `E` being the number of edges/links, while indexing is done via the sum of degrees, which can be of size `[0, 2*E]`. That results in a segmentation fault.
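The indexing scheme described above can be illustrated with a small, purely hypothetical sketch (the variable names are illustrative, not taken from the pyunicorn source): predecessors are laid out in one flat array with per-node offsets given by cumulative degree sums, a CSR-like layout, so the array must be sized by the degree sum rather than by the edge count:

```python
import numpy as np

# Tiny undirected triangle graph as an adjacency matrix
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]])

degrees = adj.sum(axis=0)                          # [2, 2, 2]
offsets = np.concatenate(([0], np.cumsum(degrees)))  # per-node start indices

n_links = int(adj.sum()) // 2                      # 3 undirected edges
# The final offset is sum(degrees) = 2 * n_links, so a flat predecessor
# array sized by n_links alone would be overrun by the last writes.
assert offsets[-1] == 2 * n_links
```

Whether the real implementation under- or over-allocates this array is exactly what the maintainers' replies above dispute; the sketch only shows why the degree sum, not the edge count, bounds the required size.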
Running `gdb python` on the core dump file led to this:
Running `gdb python` again with the `_nsi_betweenness` function now pythonized led to this: