vikeydr closed this issue 7 years ago
Let's see:
Yes, the cython support works fairly well on *nix systems. It definitely should speed up your training.
Under the hood, this looks like an issue that pops up every now and then between Windows and gensim (see this gensim pull request: https://github.com/piskvorky/gensim/pull/233).
If you are using the old version of gensim for deepwalk, I might suggest upgrading gensim and patching the line that crashes (see deepwalk issue https://github.com/phanein/deepwalk/issues/13 for a potential fix).
It works fine for me, thank you very much!
I also have a similar problem on Linux 14.04 LTS. Would you mind helping me with this issue? Thanks in advance.
BTW, I have Cython version 0.24, as reported by the command cython -V.
mji@EEE0625:~/modelzoo/deepwalk/deepwalk-1.0.2$ deepwalk --input example_graphs/karate.adjlist --output karate.embeddings
Number of nodes: 34
Number of walks: 340
Data size (walks*length): 13600
Walking...
Training...
/home/mji/anaconda2/lib/python2.7/site-packages/gensim/models/word2vec.py:406: UserWarning: Cython compilation failed, training will be slow. Do you have Cython installed? pip install cython
  warnings.warn("Cython compilation failed, training will be slow. Do you have Cython installed? pip install cython")
====================== SOLVED ===========================
Solved by changing the scipy version to ==0.15.1 in requirements.txt.
Thanks for posting your fix - I guess problems can arise when scipy and gensim have conflicting cython version dependencies.
Btw, a newer version of scipy doesn't work:
$ pip install -r requirements.txt
Requirement already satisfied (use --upgrade to upgrade): wheel>=0.23.0 in [some path]/python2.7/site-packages (from -r requirements.txt (line 1))
Requirement already satisfied (use --upgrade to upgrade): Cython>=0.20.2 in [some path]/python2.7/site-packages (from -r requirements.txt (line 2))
Requirement already satisfied (use --upgrade to upgrade): argparse>=1.2.1 in [some path]/python2.7/site-packages (from -r requirements.txt (line 3))
Requirement already satisfied (use --upgrade to upgrade): futures>=2.1.6 in [some path]/python2.7/site-packages (from -r requirements.txt (line 4))
Requirement already satisfied (use --upgrade to upgrade): six>=1.7.3 in [some path]/python2.7/site-packages (from -r requirements.txt (line 5))
Requirement already satisfied (use --upgrade to upgrade): gensim==0.10.2 in [some path]/python2.7/site-packages (from -r requirements.txt (line 6))
Requirement already satisfied (use --upgrade to upgrade): scipy>=0.15.1 in [some path]/python2.7/site-packages (from -r requirements.txt (line 7))
Requirement already satisfied (use --upgrade to upgrade): psutil>=2.1.1 in [some path]/python2.7/site-packages (from -r requirements.txt (line 8))
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.6.2 in [some path]/python2.7/site-packages (from scipy>=0.15.1->-r requirements.txt (line 7))
$ deepwalk --input example_graphs/karate.adjlist --output karate.embeddings
Number of nodes: 34
Number of walks: 340
Data size (walks*length): 13600
Walking...
Training...
[some path]/python2.7/site-packages/gensim/models/word2vec.py:406: UserWarning: Cython compilation failed, training will be slow. Do you have Cython installed? `pip install python`
warnings.warn("Cython compilation failed, training will be slow. Do you have Cython installed? `pip install python`")
I can confirm, though, that pinning scipy to ==0.15.1 works.
It worked for me on Ubuntu 14.04 with the following versions. Here is the output of conda list for the environment where deepwalk worked:
argparse     1.4.0     <pip>
Cython       0.24.1    <pip>
deepwalk     1.0.1     <pip>
futures      3.0.5     <pip>
gensim       0.10.2    <pip>
numpy        1.11.1    <pip>
openssl      1.0.2h    1
pip          8.1.2     py27_0
psutil       4.3.0     <pip>
python       2.7.12    1
readline     6.2       2
scipy        0.15.1    <pip>
setuptools   25.1.6    py27_0
six          1.10.0    <pip>
sqlite       3.13.0    0
tk           8.5.18    0
wheel        0.29.0    py27_0
zlib         1.2.8     3
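To compare another environment against the versions reported as working in this thread, something like the following could help. It is only a sketch: it uses importlib.metadata, which needs Python 3.8 or later (the environments above are Python 2.7, where pip freeze or conda list does the same job), and the helper names installed_versions and report are made up for this example.

```python
from importlib import metadata  # standard library, Python 3.8+

# Versions reported as working in this thread (Ubuntu 14.04, Python 2.7).
REPORTED_WORKING = {"gensim": "0.10.2", "scipy": "0.15.1", "Cython": "0.24.1"}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report(expected=REPORTED_WORKING):
    """Return one line per package comparing installed and reported versions."""
    found = installed_versions(expected)
    lines = []
    for name, wanted in expected.items():
        have = found[name]
        status = "match" if have == wanted else "differs"
        lines.append(
            "%s: installed=%s, reported working=%s (%s)" % (name, have, wanted, status)
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A "differs" line does not prove that version is the cause of the warning; it only shows where the environment departs from one that was reported to work.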
Same for me: a newer version of scipy doesn't work.
For those still having this same issue, I was able to use the latest version of both gensim and scipy, but I needed to make some modifications to deepwalk/__main__.py and gensim/models/word2vec.py for them to be compatible. In the first, I changed the function call model.save_word2vec_format to model.wv.save_word2vec_format. In the second, I changed
self.wv.syn0[i] = self.seeded_vector(self.wv.index2word[i] + str(self.seed))
to
self.wv.syn0[i] = self.seeded_vector(str(self.wv.index2word[i]) + str(self.seed))
This fixed it for me.
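A small illustration of why the extra str() matters, assuming the vocabulary entries are not strings (for example, integer node IDs; that is an assumption here, not something gensim guarantees). The two helper functions are hypothetical stand-ins for the expression inside seeded_vector(...), not gensim code.

```python
def seed_key_original(word, seed):
    # Mirrors the original expression: only works if `word` is already a string.
    return word + str(seed)


def seed_key_patched(word, seed):
    # Mirrors the patched expression: converts `word` to a string first.
    return str(word) + str(seed)


if __name__ == "__main__":
    print(seed_key_patched(12, 1))  # integer token is accepted
    try:
        seed_key_original(12, 1)
    except TypeError as err:
        print("original expression fails for an integer token:", err)
```

For tokens that are already strings both versions return the same key, so the patch should not change behavior for ordinary text corpora.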
Thank you very much, @Uchman21. I followed your idea, and it also works for me.
In deepwalk/__main__.py, I changed the function call model.save_word2vec_format to model.wv.save_word2vec_format.
In gensim/models/word2vec.py, I changed
self.wv.syn0[i] = self.seeded_vector(self.wv.index2word[i] + str(self.seed))
to
self.wv.syn0[i] = self.seeded_vector(str(self.wv.index2word[i]) + str(self.seed))
Also confirming that editing the source files as mentioned by the above two posters solved the problem for me using the latest gensim.
I also had to modify line 16 of deepwalk/__main__.py from
from skipgram import Skipgram
to
from .skipgram import Skipgram
I am using Python 3.5.
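The leading dot is needed because Python 3 no longer resolves a bare "from skipgram import ..." relative to the enclosing package. The sketch below builds a throwaway package in a temporary directory to show the difference; the package and module names (walkdemo_pkg, walkdemo_skipgram) are invented for the demo and have nothing to do with deepwalk's real layout.

```python
import importlib
import os
import sys
import tempfile

PKG = "walkdemo_pkg"  # hypothetical package name, used only for this demo


def build_demo_package(root):
    """Create a tiny package with one implicit and one explicit relative import."""
    pkg_dir = os.path.join(root, PKG)
    os.makedirs(pkg_dir)
    files = {
        "__init__.py": "",
        "walkdemo_skipgram.py": "class Skipgram(object):\n    pass\n",
        "implicit.py": "from walkdemo_skipgram import Skipgram\n",
        "explicit.py": "from .walkdemo_skipgram import Skipgram\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg_dir, name), "w") as handle:
            handle.write(body)


def try_import(module_name):
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


def run_demo():
    """Import both variants from a temporary package and report which succeed."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            return {
                "implicit": try_import(PKG + ".implicit"),
                "explicit": try_import(PKG + ".explicit"),
            }
        finally:
            sys.path.remove(root)


if __name__ == "__main__":
    print(run_demo())
```

On Python 3 the implicit form is expected to fail and the explicit one to succeed, provided no top-level module named walkdemo_skipgram happens to be importable.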
I don't think touching gensim's code is a good move. Just make sure the elements are strings, and gensim's word2vec will happily take them.
Like what @Uchman21 did in deepwalk/__main__.py, change model.save_word2vec_format to model.wv.save_word2vec_format, then add a couple of str() calls here and there in the function random_walk in deepwalk/graph.py:
  def random_walk(self, path_length, alpha=0, rand=random.Random(), start=None):
    """ Returns a truncated random walk.

        path_length: Length of the random walk.
        alpha: probability of restarts.
        start: the start node of the random walk.
    """
    G = self
    if start:
      path = [str(start)]
    else:
      # Sampling is uniform w.r.t V, and not w.r.t E
      path = [str(rand.choice(list(G.keys())))]

    while len(path) < path_length:
      cur = int(path[-1])
      if len(G[cur]) > 0:
        if rand.random() >= alpha:
          path.append(str(rand.choice(G[cur])))
        else:
          path.append(str(path[0]))
      else:
        break
    return path
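For trying the idea outside deepwalk, here is a standalone sketch of the same walk over a plain dict of adjacency lists. It is not the deepwalk implementation: the function takes the graph as an argument instead of being a method, tracks the current node in its native type so node IDs need not be integers, and tests start against None so that a node with ID 0 can be used as a start. The example graph is made up.

```python
import random


def random_walk(graph, path_length, alpha=0.0, rand=None, start=None):
    """Return a truncated random walk as a list of string tokens.

    graph: dict mapping each node to a list of its neighbours.
    alpha: probability of restarting from the first node at each step.
    """
    if rand is None:
        rand = random.Random()
    if start is None:
        # Sampling is uniform over nodes, not over edges.
        start = rand.choice(list(graph.keys()))
    current = start
    path = [str(current)]
    while len(path) < path_length:
        neighbours = graph[current]
        if not neighbours:
            break  # dead end: stop the walk early
        if rand.random() >= alpha:
            current = rand.choice(neighbours)
        else:
            current = start  # restart from the first node
        path.append(str(current))
    return path


if __name__ == "__main__":
    toy_graph = {0: [1, 2], 1: [0], 2: [0], 3: []}
    print(random_walk(toy_graph, 5, rand=random.Random(0), start=0))
```

Every element of the returned list is a string, which is the property the comment above asks for before handing walks to word2vec.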
How to edit the gensim/models/word2vec.py file?
@aswathydiv36 We have already updated the random walk part, so you do not need to modify gensim word2vec anymore.
ok thanks
When I run deepwalk, I get a warning:
C:\Python27\lib\site-packages\gensim\models\word2vec.py:406: UserWarning: Cython compilation failed, training will be slow. Do you have Cython installed? pip install cython
  warnings.warn("Cython compilation failed, training will be slow. Do you have Cython installed? pip install cython")
However, I have installed Cython as the requirements say (version 0.23.4). Why does this happen? And how slow will it be without Cython?
Thank you for your attention.