Open TMorville opened 6 years ago
OK. So the switching of indexes seems to stem from lines 55-56
if rid is None:
    rid = np.random.permutation(range(N))
which is later used in metis_one_level(rr,cc,vv,rid,weights)
for ii in range(N):
    tid = rid[ii]
    if not marked[tid]:
        wmax = 0.0
        rs = rowstart[tid]
        marked[tid] = True
        bestneighbor = -1
where the bug appears. Here N = rr[nnz-1] + 1, which is 9974 in the test data. This conflicts with the maximum value of rid, 9999, which sets tid. So whenever the loop
for ii in range(N):
    tid = rid[ii]
picks an entry of rid that is 9974 or larger, it gives tid a value >= N, which is then used to index
marked = np.zeros(N, np.bool)
rowstart = np.zeros(N, np.int32)
rowlength = np.zeros(N, np.int32)
cluster_id = np.zeros(N, np.int32)
but all of those are of length 9974, hence the index error. Here is a printout of a subsample of tid values just before a crash:
Value of tid: 322
Value of tid: 2881
Value of tid: 8202
Value of tid: 9726
Value of tid: 8039
Value of tid: 126
Value of tid: 276
Value of tid: 9994
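The mismatch can be reproduced on a toy example. This is only a sketch under assumptions: the arrays below are made-up stand-ins for the real data (10 nodes instead of 10000, with the last three rows holding no non-zeros), and sizing the work arrays from `rid.max()` is one possible workaround, not necessarily how the library should be fixed.

```python
import numpy as np

# Toy stand-in for the real data: 10 nodes, but rows 7..9 hold no non-zeros.
n_nodes = 10
rr = np.array([0, 0, 2, 5, 6])               # sorted row indices of the non-zeros
nnz = rr.shape[0]

N = rr[nnz - 1] + 1                          # 7, computed as in the quoted code
rid = np.random.permutation(range(n_nodes))  # holds 0..9, like rid reaching 9999

# Entries of rid that cannot index an array of length N.
bad = np.sort(rid[rid >= N])

# Hypothetical workaround: make the work arrays cover every id found in rid.
size = max(int(N), int(rid.max()) + 1)
marked = np.zeros(size, dtype=bool)
for ii in range(N):
    marked[rid[ii]] = True                   # safe, because size > rid.max()
```

With arrays of length `N` the same loop raises an IndexError as soon as `rid[ii]` is one of the values in `bad`.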
Sizing those arrays manually to 10000 resolves the bug:
marked = np.zeros(10000, np.bool)
rowstart = np.zeros(10000, np.int32)
rowlength = np.zeros(10000, np.int32)
cluster_id = np.zeros(10000, np.int32)
but yields yet another error:
AssertionError Traceback (most recent call last)
<ipython-input-6-7776269d0a82> in <module>()
----> 1 graphs, perm = coarsening.coarsen(ajd_sparse_ss, levels=3, self_connections=False)
~/projects/erst/graph-embedding/lib/coarsening.py in coarsen(A, levels, self_connections)
9 """
10 graphs, parents = metis(A, levels)
---> 11 perms = compute_perm(parents)
12
13 for i, A in enumerate(graphs):
~/projects/erst/graph-embedding/lib/coarsening.py in compute_perm(parents)
199 indices_node = list(np.where(parent == i)[0])
200 print("Len of indices_node", len(indices_node))
--> 201 assert 0 <= len(indices_node) <= 2
202 #print('indices_node: {}'.format(indices_node))
203
AssertionError:
which happens because the length of indices_node is 1208, while the assertion allows at most 2. Perhaps this needs its own tracker?
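To see which clusters trip the assertion, one could count how many fine nodes map to each parent id. The helper below is hypothetical (`oversized_clusters` is not part of the package) and assumes `parent` is a 1-D array of non-negative integer cluster ids, as the `np.where(parent == i)` call in the traceback suggests.

```python
import numpy as np

def oversized_clusters(parent, max_children=2):
    """Return {cluster_id: size} for clusters with more than max_children nodes."""
    counts = np.bincount(parent)                 # nodes assigned to each cluster id
    ids = np.flatnonzero(counts > max_children)  # ids that would fail the assert
    return {int(i): int(counts[i]) for i in ids}

# Toy parent array: cluster 2 has three children, so it would fail the assertion.
parent = np.array([0, 0, 1, 2, 2, 2, 3])
report = oversized_clusters(parent)
```

An empty result would mean every cluster has at most two children, which is what the assertion in compute_perm checks for.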
@TMorville I have the same problem. What did you end up doing?
First off, thanks for developing and sharing this interesting package. I've forked the repo and all test cases work fine. This is probably related to #14, but I've made a new issue because I have data.
I have an adjacency matrix from a large directed graph. The dimensions of the adjacency matrix are (7919711, 7116242) and the structure is extremely sparse: the number of non-zero elements is 2732656. When I try to run the pooling on a (10000x10000) subset of my own data, which you can find here (5.07 KB file), I can produce index errors (run on sparse_adj_subset). If I rerun, the offending index is different, showing that the index changes.
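The changing index is consistent with rid defaulting to a random permutation, as quoted elsewhere in this thread. A minimal sketch of how the failure could be made reproducible while debugging, assuming the permutation is drawn from NumPy's global generator (`draw_rid` is a hypothetical stand-in, not a function of the package):

```python
import numpy as np

def draw_rid(n, seed=None):
    """Hypothetical stand-in for the default rid: a random permutation of 0..n-1."""
    if seed is not None:
        np.random.seed(seed)  # seeds the global generator behind np.random.permutation
    return np.random.permutation(range(n))

a = draw_rid(10, seed=0)
b = draw_rid(10, seed=0)
same = np.array_equal(a, b)   # identical order when the seed is fixed
```

Calling `np.random.seed` once before `coarsening.coarsen` should have the same effect, provided nothing else reseeds the generator in between.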
I am running with
graphs, perm = coarsening.coarsen(ajd_sparse_ss, levels=3, self_connections=False)
but setting self_connections=True gives similar problems.