dominiek / word2vec-explorer

Tool for exploring Word Vector models
MIT License

Issues and fixes running the Demo #1

Open madhu131313 opened 7 years ago

madhu131313 commented 7 years ago

It would be great if you could add Cython to requirements.txt so that new users don't hit the error. Until then, the fix is: pip install cython

madhu131313 commented 7 years ago

Also, BLAS is missing for tsne. The following package solved that for me: /usr/lib/atlas-base/atlas/libblas.so

madhu131313 commented 7 years ago

Two more things

  1. I got this error: ImportError: libatlas.so.3gf: cannot open shared object file: No such file or directory. I fixed it using http://unix.stackexchange.com/a/52705

  2. The routes module, which is needed to run the demo as described, is missing. It can be installed with: easy_install routes

ckhung commented 7 years ago

Same as @madhu131313. Here is what I did, in summary:

apt-get install libatlas-base-dev
pip install routes
pip install -r requirements.txt

I ran this inside the floydhub/dl-docker Docker image.

BTW, for running inside Docker, see also #2
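The missing pieces reported so far in this thread (Cython, routes, and tsne with its BLAS/ATLAS dependency) can be checked before starting the demo. Below is a minimal sketch, not part of the repository: the helper name `missing_modules` and the module list are assumptions made for illustration. It simply tries to import each module and collects the error message for any that fail, which should also surface a missing shared library such as libatlas, since that shows up as an ImportError as well.

```python
import importlib

# Import names of the packages reported missing in this thread.
# This list is an assumption for illustration; adjust it to your setup.
REPORTED_MODULES = ("Cython", "routes", "tsne")


def missing_modules(names=REPORTED_MODULES):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # Covers both "No module named ..." and missing shared objects.
            failures[name] = str(exc)
    return failures


if __name__ == "__main__":
    problems = missing_modules()
    if not problems:
        print("All reported dependencies import cleanly.")
    for name, message in problems.items():
        print("{}: {}".format(name, message))
```

Running the file prints one line per module that fails to import, together with the original error text, so it is easy to tell a missing Python package from a missing system library.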

HenkPoley commented 7 years ago

On macOS Sierra (10.12) you invoke pip2 instead of pip.

You also have to run npm start at least once, or the UI will not have the correct styling and layout.

npm install
npm start

Even after this, you'll have to de-select and re-select "Show Labels", or the labels won't update when you run a new query.

Additionally, it might help to lower the perplexity value a bit, so that the tSNE library doesn't bail out when you want the explorer to list fewer than 100 items:

diff --git a/explorer.py b/explorer.py
index 7186eaa..35d3fa5 100644
--- a/explorer.py
+++ b/explorer.py
@@ -20,9 +20,9 @@ class Exploration(dict):
         self.stats = {}

     def reduce(self):
-        print('Performing tSNE reduction' +
+        print('Performing tSNE reduction ' +
               'on {} vectors'.format(len(self.vectors)))
-        self.reduction = bh_sne(np.array(self.vectors, dtype=np.float64))
+        self.reduction = bh_sne(np.array(self.vectors, dtype=np.float64), perplexity=5)

     def cluster(self, num_clusters=30):
         clustering = KMeans(n_clusters=num_clusters)
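As an alternative to hard-coding perplexity=5, the value could be derived from the number of vectors. The sketch below is an assumption-laden illustration, not code from the repository: it assumes the Barnes-Hut implementation behind bh_sne requires N - 1 >= 3 * perplexity (as the reference Barnes-Hut t-SNE code does) and that the default perplexity is 30. The helper name `safe_perplexity` is hypothetical.

```python
def safe_perplexity(num_vectors, preferred=30.0):
    """Return a perplexity that respects the assumed bound N - 1 >= 3 * perplexity.

    `preferred` is the value to use when there are enough vectors
    (30.0 is assumed here to be the library default).
    """
    if num_vectors < 2:
        raise ValueError("need at least 2 vectors to run tSNE")
    upper_bound = (num_vectors - 1) / 3.0
    return min(preferred, upper_bound)


# Hypothetical use inside Exploration.reduce():
#   data = np.array(self.vectors, dtype=np.float64)
#   self.reduction = bh_sne(data, perplexity=safe_perplexity(len(self.vectors)))

if __name__ == "__main__":
    for n in (200, 31, 16):
        print(n, safe_perplexity(n))
```

With this approach, large result sets keep the preferred perplexity, and only small ones are scaled down.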

Additional tip: ConceptNet Numberbatch is a nice, recently updated set of word embeddings, and its '-en' version is a good drop-in replacement for word2vec: https://github.com/commonsense/conceptnet-numberbatch

fathimakmurshida commented 5 years ago

Can you please explain how to load the explorer with a Word2Vec model?