palash1992 / GEM

BSD 3-Clause "New" or "Revised" License

IndexError: index 82140 is out of bounds for axis 0 with size 82140 #60

Closed sanazkh93 closed 5 years ago

sanazkh93 commented 5 years ago

Hi palash, I want to do link prediction on the Slashdot dataset. When I replace karate.edgelist or sbm.gpickle with slashdot.edgelist or slashdot.gpickle, I get the error below, but when I run the code with your data I don't get the error.

File "/GEM/run_sbm.py", line 85, in <module>
    Y, t = embedding.learn_embedding(graph=G, edge_f=None, is_weighted=True, no_python=True)
File "gem/embedding/gf.py", line 99, in learn_embedding
    [f1, f2, f] = self._get_f_value(graph)
File "gem/embedding/gf.py", line 57, in _get_f_value
    f1 += (w - np.dot(self._X[i, :], self._X[j, :]))**2
IndexError: index 82140 is out of bounds for axis 0 with size 82140

Could you please help me with this? Thank you.

palash1992 commented 5 years ago

Are your nodes numbered 0 to n-1? I believe that's not the case, and that would be the issue.
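
One way to test this is sketched below. It is a minimal, standalone check, not part of GEM: `check_node_ids` is a hypothetical helper, and the toy edge list is made up for illustration. It relies only on the fact that an embedding matrix with n rows can be indexed by 0 to n-1, so any node ID outside that range would trigger the IndexError.

```python
def check_node_ids(node_ids):
    """Return (n, out_of_range) for a collection of integer node IDs.

    n is the number of distinct nodes; out_of_range lists the IDs that
    cannot index an n-row matrix (anything below 0 or at/above n).
    """
    ids = set(node_ids)
    n = len(ids)
    out_of_range = sorted(i for i in ids if i < 0 or i >= n)
    return n, out_of_range


# Hypothetical toy graph: 4 distinct nodes, labeled 1..4 instead of 0..3.
edges = [(1, 2), (2, 3), (3, 4)]
nodes = [u for edge in edges for u in edge]
n, bad = check_node_ids(nodes)
print(n, bad)  # 4 [4]
```

In the toy case, node 4 is out of bounds for a matrix of size 4, which has the same shape as "index 82140 is out of bounds for axis 0 with size 82140". An empty second value means the IDs are exactly 0 to n-1.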

sanazkh93 commented 5 years ago

Yes, I can confirm my nodes are numbered 0 to n-1. I am using the Slashdot data from SNAP. Here's the link to the data: http://snap.stanford.edu/data/soc-sign-Slashdot090221.html Please note that I am reading the data and creating an edge list, which I pass to your code in the same way as the karate edge list. 1soc-sign-Slashdot090221.txt

palash1992 commented 5 years ago

Firstly, are you using the latest version of GEM? Secondly, can you convert the data into the format I used for edgelist and see if you get the same issue? I have encountered this issue multiple times but it is often a problem with the input format.
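
A conversion along these lines might look like the sketch below. It rests on assumptions that the thread does not confirm: that the target format is one whitespace-separated `u v` pair per line, that lines starting with `#` are comments, and that any extra column (such as the edge sign in the SNAP file) can be dropped. `remap_edges` is a hypothetical helper, not a GEM function.

```python
def remap_edges(lines):
    """Rewrite raw edge lines as 'u v' strings with node IDs 0..n-1.

    IDs are assigned in order of first appearance. Comment lines ('#')
    and blank lines are skipped; columns after the first two are
    dropped (an assumption made for this sketch).
    """
    mapping = {}
    out = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        u, v = line.split()[:2]
        for node in (u, v):
            if node not in mapping:
                mapping[node] = len(mapping)
        out.append(f"{mapping[u]} {mapping[v]}")
    return out, mapping


# Made-up sample in a SNAP-like layout: source, target, sign.
raw = ["# FromNodeId ToNodeId Sign", "10\t20\t1", "20\t35\t-1"]
edgelist, mapping = remap_edges(raw)
print(edgelist)  # ['0 1', '1 2']
```

Keeping `mapping` around makes it possible to translate embedding rows back to the original node labels afterwards.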

dhimmel commented 4 years ago

I also just got this error when running gem.embedding.node2vec.node2vec.learn_embedding:

/usr/local/lib/python3.8/site-packages/gem/embedding/node2vec.py in learn_embedding(self, graph, edge_f, is_weighted, no_python)
     82             print(str(e))
     83             raise Exception('./node2vec not found. Please compile snap, place node2vec in the system path and grant executable permission')
---> 84         self._X = graph_util.loadEmbedding('tempGraph.emb')
     85         t2 = time()
     86         return self._X, (t2 - t1)

/usr/local/lib/python3.8/site-packages/gem/utils/graph_util.py in loadEmbedding(file_name)
    168             emb = line.strip().split()
    169             emb_fl = [float(emb_i) for emb_i in emb[1:]]
--> 170             X[int(emb[0]), :] = emb_fl
    171     return X
    172 

IndexError: index 1996 is out of bounds for axis 0 with size 1994

I believe my graph is a DiGraph as described in the README. What is the root cause of the issue?

Do I need to supply the edge_f parameter for node2vec to work? Does edge_f stand for the path to an .edgelist file?