jhlau / doc2vec

Python scripts for training/testing paragraph vectors
Apache License 2.0

Forked version of gensim #4

Closed: chaitjo closed this issue 7 years ago

chaitjo commented 7 years ago

I understand that in order to use pretrained word embeddings to train the doc2vec models, we should install your forked version of gensim. However, I have failed to configure it properly with a C compiler (MinGW in my case; it could also be BLAS etc.). I'm on Windows and am using Anaconda.

I tried installing straight from setup.py, using pip, and even created a conda package and installed using that. Each time, I did succeed in installing the forked version of gensim.

However, every time I tried the following commands in a Python shell:

>>> import gensim
>>> gensim.models.word2vec.FAST_VERSION

...the output was -1. This meant that the compiled routines were not being used and training would be very, very slow (about 70 times slower, iirc).
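The check above can be wrapped in a small script so it is easy to rerun after each install attempt. This is only a sketch: `FAST_VERSION` is the attribute queried in the shell session above, while the helper name `describe_fast_version` is made up for illustration. It assumes that any negative value means the compiled extension did not load.

```python
# Sketch: report whether gensim's compiled word2vec routines were loaded.
# `describe_fast_version` is a hypothetical helper, not part of gensim.


def describe_fast_version(value: int) -> str:
    """Turn a FAST_VERSION value into a human-readable status line."""
    if value < 0:
        # Negative value: gensim fell back to the slow, non-compiled code path.
        return "slow path: compiled extension not loaded (FAST_VERSION=%d)" % value
    return "fast path: compiled extension loaded (FAST_VERSION=%d)" % value


if __name__ == "__main__":
    try:
        from gensim.models import word2vec
    except ImportError:
        print("gensim is not installed in this environment")
    else:
        print(describe_fast_version(word2vec.FAST_VERSION))
```

Running it once against the conda-installed gensim and once against the fork should show which of the two installs lost the compiled routines.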

How do I get your version and still retain the link to the C compiler? (If I install the currently distributed version using conda install gensim, it is linked to my MinGW.)

jhlau commented 7 years ago

Hi, this isn't a forked version issue. Please post it on the canonical gensim forum.

chaitjo commented 7 years ago

My apologies for posting here. I solved the issue by editing all the files you changed. I modified word2vec.py and doc2vec.py according to the forked version's commit history and it seems to be working perfectly well right now.
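For anyone repeating this, one way to see exactly what has to change is to diff the stock file against the fork's copy of the same file. The sketch below uses only the standard library. The function names and the file paths passed to `diff_files` are placeholders, not anything taken from the fork.

```python
import difflib
from pathlib import Path


def diff_text(stock: str, patched: str, name: str = "word2vec.py") -> str:
    """Return a unified diff from the stock source to the patched source."""
    lines = difflib.unified_diff(
        stock.splitlines(keepends=True),
        patched.splitlines(keepends=True),
        fromfile="stock/" + name,
        tofile="fork/" + name,
    )
    return "".join(lines)


def diff_files(stock_path: str, patched_path: str) -> str:
    """Read two source files and diff them (both paths are placeholders)."""
    stock = Path(stock_path).read_text(encoding="utf-8")
    patched = Path(patched_path).read_text(encoding="utf-8")
    return diff_text(stock, patched, Path(stock_path).name)
```

An empty result means the two files are identical. Anything else lists the lines to carry over by hand into the installed copy.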

drussellmrichie commented 6 years ago

Having the same issue. @chaitjo I realize you did this 1.5 years ago, but do you remember more specifically what you did? I do have a C compiler that my master version of gensim is able to see. Seems it's just a matter of getting this forked version of gensim to see it...?

chaitjo commented 6 years ago

Hey @drussellmrichie. I'm obviously not 100% sure about this, but here's what I think I did 1.5 years ago:

  1. Installed gensim normally (not @jhlau's fork) so that it used the desired C compiler.
  2. Manually edited the two files in the package by copying the changes made here: https://github.com/jhlau/gensim/commits/develop
  3. It worked!
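Before step 2, it may help to find where the installed files actually live and keep a copy of the originals. The following is a minimal sketch using the standard library. The helper names and the `.orig` suffix are assumptions for illustration, and the module names you pass in (for example `gensim.models.word2vec`) depend on your gensim version.

```python
import importlib.util
import shutil
from pathlib import Path


def locate_module_file(dotted_name: str) -> Path:
    """Return the source file that backs an importable module."""
    spec = importlib.util.find_spec(dotted_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(dotted_name)
    return Path(spec.origin)


def backup_file(path: Path, backup_dir: Path) -> Path:
    """Copy `path` into `backup_dir` with a .orig suffix and return the copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / (path.name + ".orig")
    shutil.copy2(path, target)
    return target
```

For example, `backup_file(locate_module_file("gensim.models.word2vec"), Path("gensim_backup"))` would save the stock file before any manual edits, so the install can be restored if the patched version fails to import.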

Hope it helps!

drussellmrichie commented 6 years ago

Oh, I see! I might be able to reproduce that, then. Thanks for the tip!

IIKshitiz26II commented 5 years ago

Hey @chaitjo, did you get the error "Cannot import name Word2VecVocab"? How did you fix it?