nateraw / Lda2vec-Tensorflow

Tensorflow 1.5 implementation of Chris Moody's Lda2vec, adapted from @meereeum
MIT License

Add requirements.txt #27

Closed nateraw closed 4 years ago

nateraw commented 5 years ago

Would really like some help with this. I don't have any experience setting up a requirements.txt file for a package. Here are the versions of the packages I'm using to run this:

pandas (0.21.1)
numpy (1.16.2)
sklearn (0.0)
tensorflow-gpu (1.5.0)
pyLDAvis (2.1.2)
Keras (2.1.4)
spacy (2.0.11)
tqdm (4.23.4)
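
As a rough sketch, the versions listed above could be turned into requirements.txt lines as follows. The mapping and the helper name `render_requirements` are illustrative assumptions, not the repository's actual file. The PyPI package named "sklearn" at version 0.0 is, as far as I know, only a stub that pulls in scikit-learn, so the real distribution name is used here and left unpinned because its version is not given above.

```python
# Versions copied from the comment above. This mapping is only an
# illustration; it is not the repository's actual requirements.txt.
PINNED = {
    "pandas": "0.21.1",
    "numpy": "1.16.2",
    "scikit-learn": None,  # real version not stated above, so left unpinned
    "tensorflow-gpu": "1.5.0",
    "pyLDAvis": "2.1.2",
    "Keras": "2.1.4",
    "spacy": "2.0.11",
    "tqdm": "4.23.4",
}


def render_requirements(pins):
    """Return requirements.txt content with one requirement per line."""
    lines = []
    for name, version in pins.items():
        # Unpinned packages are written as a bare name.
        lines.append(name if version is None else "{}=={}".format(name, version))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_requirements(PINNED), end="")
```

Redirecting the printed output into requirements.txt would give a file that pip can install with the -r option.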

Also, note that by default the tokenizer uses spaCy's "en_core_web_sm" language model. It would be nice if a pip install automatically installed spaCy along with the "en_core_web_sm" model, or at least prompted the user to do so.
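
One possible way to prompt the user, sketched under assumptions: spaCy models are installed as ordinary Python packages, so the package can check for the model at import time and print a hint if it is missing. The function names `model_installed` and `install_hint` are hypothetical and do not exist in this repository.

```python
import importlib.util


def model_installed(name="en_core_web_sm"):
    """Return True if a package with this name can be found by the interpreter."""
    return importlib.util.find_spec(name) is not None


def install_hint(name="en_core_web_sm"):
    """Return a message telling the user how to get the model, or None if present."""
    if model_installed(name):
        return None
    return (
        "spaCy model '{0}' was not found. "
        "Install it with: python -m spacy download {0}".format(name)
    )


if __name__ == "__main__":
    hint = install_hint()
    if hint:
        print(hint)
```

This only reports the problem. Installing the model automatically from setup.py is possible in principle, but a hint keeps the install step explicit for the user.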

nateraw commented 5 years ago

@dbl001 in my dev branch I did everything in a new, clean environment. I think this is exactly what I needed to do to set it up (if you have conda). If you don't have conda, you'll have to translate parts of this to pip. If you get the chance to check out the commit I tagged you in, try setting up this environment with these commands and the updated requirements.txt file on the dev branch.

### Create new conda environment
conda create --name lda2vec_test python=3.5

### Activate new environment
source activate lda2vec_test

### Add conda forge so you can install spacy
conda config --add channels conda-forge

### Install my spacy version
conda install spacy=2.0.11

### Install spacy language model
sudo pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz --trusted-host github.com

### Install the rest of the requirements
sudo pip install -r requirements.txt

nateraw commented 5 years ago

Note - on Windows, run "activate lda2vec_test" instead. "source activate" only works in Unix shells.

nateraw commented 5 years ago

@dbl001 were you able to set up the environment successfully and run it? If not, I may probe a little more to ensure it works before updating the readme with install directions.

Also, if this all works out, I may take the package off of PyPI. It doesn't seem useful to pip install this, as users probably want to play around with parameters. What do you think about that?

nateraw commented 5 years ago

On Mac I had to take tensorflow-gpu==1.5.0 out of requirements.txt and instead run: pip install https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.5.0-py3-none-any.whl

This wasn't an issue on Linux. Not sure how to mitigate this issue without Dockerizing the repo.
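
A minimal sketch of one way to handle the platform difference without Docker: pick the TensorFlow requirement from the platform at install time. The helper name `tensorflow_requirement` is hypothetical. The wheel URL and the pinned version are the ones quoted above. pip also supports environment markers on requirement lines (for example a trailing `; sys_platform == "darwin"`), which might achieve the same thing inside requirements.txt itself, though that has not been tested against this repository.

```python
import sys

# CPU-only wheel quoted in the comment above.
MAC_CPU_WHEEL = (
    "https://storage.googleapis.com/tensorflow/mac/cpu/"
    "tensorflow-1.5.0-py3-none-any.whl"
)


def tensorflow_requirement(platform=None):
    """Return the pip requirement to use for TensorFlow on the given platform."""
    platform = sys.platform if platform is None else platform
    if platform == "darwin":
        # macOS: use the CPU wheel instead of tensorflow-gpu.
        return MAC_CPU_WHEEL
    return "tensorflow-gpu==1.5.0"


if __name__ == "__main__":
    print(tensorflow_requirement())
```

The returned string can be passed to pip install, or placed in the install_requires list that setup.py builds.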

dbl001 commented 5 years ago

I ran the script on my Mac.

Not sure why IPython started as Python 2.7.6. Pandas was missing.

$ conda create --name lda2vec_test python=3.5
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.13
  latest version: 4.6.11

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /Users/davidlaxer/anaconda/envs/lda2vec_test

  added / updated specs: 
    - python=3.5

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ncurses-6.1                |    h0a44026_1002         1.3 MB  conda-forge
    zlib-1.2.11                |    h1de35cc_1004         101 KB  conda-forge
    libedit-3.1.20181209       |       hb402a30_0         159 KB
    tk-8.6.9                   |    ha441bb4_1001         3.2 MB  conda-forge
    python-3.5.6               |       hc167b69_0        14.9 MB
    openssl-1.0.2r             |       h1de35cc_0         3.0 MB  conda-forge
    libcxxabi-4.0.1            |       hcfea43d_1         458 KB
    pip-18.0                   |        py35_1001         1.8 MB  conda-forge
    libcxx-4.0.1               |       hcfea43d_1         1.2 MB
    wheel-0.32.0               |        py35_1000          34 KB  conda-forge
    sqlite-3.27.2              |       ha441bb4_0         2.3 MB
    readline-7.0               |    hcfe32e1_1001         393 KB  conda-forge
    certifi-2018.8.24          |        py35_1001         139 KB  conda-forge
    libffi-3.2.1               |    h6de7cb9_1006          43 KB  conda-forge
    setuptools-40.4.3          |           py35_0         573 KB  conda-forge
    xz-5.2.4                   |    h1de35cc_1001         268 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        29.8 MB

The following NEW packages will be INSTALLED:

    ca-certificates: 2019.3.9-hecc5488_0     conda-forge
    certifi:         2018.8.24-py35_1001     conda-forge
    libcxx:          4.0.1-hcfea43d_1                   
    libcxxabi:       4.0.1-hcfea43d_1                   
    libedit:         3.1.20181209-hb402a30_0            
    libffi:          3.2.1-h6de7cb9_1006     conda-forge
    ncurses:         6.1-h0a44026_1002       conda-forge
    openssl:         1.0.2r-h1de35cc_0       conda-forge
    pip:             18.0-py35_1001          conda-forge
    python:          3.5.6-hc167b69_0                   
    readline:        7.0-hcfe32e1_1001       conda-forge
    setuptools:      40.4.3-py35_0           conda-forge
    sqlite:          3.27.2-ha441bb4_0                  
    tk:              8.6.9-ha441bb4_1001     conda-forge
    wheel:           0.32.0-py35_1000        conda-forge
    xz:              5.2.4-h1de35cc_1001     conda-forge
    zlib:            1.2.11-h1de35cc_1004    conda-forge

Proceed ([y]/n)? 
### Activate new environment
source activate lda2vec_test

### Add conda forge so you can install spacy
conda config --add channels conda-forge

### Install my spacy version
conda install spacy=2.0.11

### Install spacy language model
sudo pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz

### Install the rest of the requirements
sudo pip install -r requirements.txt

Downloading and Extracting Packages
ncurses-6.1          | 1.3 MB    | ##################################### | 100% 
zlib-1.2.11          | 101 KB    | ##################################### | 100% 
libedit-3.1.20181209 | 159 KB    | ##################################### | 100% 
tk-8.6.9             | 3.2 MB    | ##################################### | 100% 
python-3.5.6         | 14.9 MB   | ##################################### | 100% 
openssl-1.0.2r       | 3.0 MB    | ##################################### | 100% 
libcxxabi-4.0.1      | 458 KB    | ##################################### | 100% 
pip-18.0             | 1.8 MB    | ##################################### | 100% 
libcxx-4.0.1         | 1.2 MB    | ##################################### | 100% 
wheel-0.32.0         | 34 KB     | ##################################### | 100% 
sqlite-3.27.2        | 2.3 MB    | ##################################### | 100% 
readline-7.0         | 393 KB    | ##################################### | 100% 
certifi-2018.8.24    | 139 KB    | ##################################### | 100% 
libffi-3.2.1         | 43 KB     | ##################################### | 100% 
setuptools-40.4.3    | 573 KB    | ##################################### | 100% 
xz-5.2.4             | 268 KB    | ##################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use:
# > source activate lda2vec_test
#
# To deactivate an active environment, use:
# > source deactivate
#

MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ ### Activate new environment
MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ source activate lda2vec_test
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ 
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ ### Add conda forge so you can install spacy
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ conda config --add channels conda-forge
Warning: 'conda-forge' already in 'channels' list, moving to the top
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ 
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ ### Install my spacy version
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ conda install spacy=2.0.11
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.13
  latest version: 4.6.11

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /Users/davidlaxer/anaconda/envs/lda2vec_test

  added / updated specs: 
    - spacy=2.0.11

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ujson-1.35                 |py35h1de35cc_1001          24 KB  conda-forge
    preshed-1.0.1              |   py35hfc679d8_0          70 KB  conda-forge
    cymem-1.31.2               |   py35hfc679d8_0          22 KB  conda-forge
    spacy-2.0.11               |   py35hf8a1672_1        40.2 MB  conda-forge
    libgfortran-3.0.1          |                0         495 KB  conda-forge
    termcolor-1.1.0            |             py_2           6 KB  conda-forge
    tqdm-4.31.1                |             py_0          40 KB  conda-forge
    regex-2017.11.09           |           py35_0         815 KB  conda-forge
    dill-0.2.8.2               |           py35_0         110 KB  conda-forge
    toolz-0.9.0                |             py_1          42 KB  conda-forge
    msgpack-python-0.5.6       |   py35h2d50403_3          97 KB  conda-forge
    wrapt-1.10.11              |   py35h470a237_1          42 KB  conda-forge
    msgpack-numpy-0.4.4.2      |             py_1          10 KB  conda-forge
    numpy-base-1.15.2          |   py35ha711998_0         4.0 MB
    numpy-1.15.2               |   py35h926163e_0          47 KB
    plac-0.9.6                 |             py_1          18 KB  conda-forge
    cytoolz-0.9.0.1            |   py35h470a237_0         407 KB  conda-forge
    libopenblas-0.3.3          |       hdc02c5d_3         8.4 MB
    thinc-6.10.3               |   py35hf8a1672_3         1.5 MB  conda-forge
    murmurhash-0.28.0          |py35h0a44026_1000          15 KB  conda-forge
    six-1.11.0                 |           py35_1          21 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        56.4 MB

The following NEW packages will be INSTALLED:

    blas:           2.4-openblas             conda-forge
    cymem:          1.31.2-py35hfc679d8_0    conda-forge
    cytoolz:        0.9.0.1-py35h470a237_0   conda-forge
    dill:           0.2.8.2-py35_0           conda-forge
    libblas:        3.8.0-4_openblas         conda-forge
    libcblas:       3.8.0-4_openblas         conda-forge
    libgfortran:    3.0.1-0                  conda-forge
    liblapack:      3.8.0-4_openblas         conda-forge
    liblapacke:     3.8.0-4_openblas         conda-forge
    libopenblas:    0.3.3-hdc02c5d_3                    
    msgpack-numpy:  0.4.4.2-py_1             conda-forge
    msgpack-python: 0.5.6-py35h2d50403_3     conda-forge
    murmurhash:     0.28.0-py35h0a44026_1000 conda-forge
    numpy:          1.15.2-py35h926163e_0               
    numpy-base:     1.15.2-py35ha711998_0               
    openblas:       0.3.5-h436c29b_1001      conda-forge
    plac:           0.9.6-py_1               conda-forge
    preshed:        1.0.1-py35hfc679d8_0     conda-forge
    regex:          2017.11.09-py35_0        conda-forge
    six:            1.11.0-py35_1            conda-forge
    spacy:          2.0.11-py35hf8a1672_1    conda-forge
    termcolor:      1.1.0-py_2               conda-forge
    thinc:          6.10.3-py35hf8a1672_3    conda-forge
    toolz:          0.9.0-py_1               conda-forge
    tqdm:           4.31.1-py_0              conda-forge
    ujson:          1.35-py35h1de35cc_1001   conda-forge
    wrapt:          1.10.11-py35h470a237_1   conda-forge

Proceed ([y]/n)? 
### Install spacy language model
sudo pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz

### Install the rest of the requirements
sudo pip install -r requirements.txt

Downloading and Extracting Packages
ujson-1.35           | 24 KB     | ##################################### | 100% 
preshed-1.0.1        | 70 KB     | ##################################### | 100% 
cymem-1.31.2         | 22 KB     | ##################################### | 100% 
spacy-2.0.11         | 40.2 MB   | ##################################### | 100% 
libgfortran-3.0.1    | 495 KB    | ##################################### | 100% 
termcolor-1.1.0      | 6 KB      | ##################################### | 100% 
tqdm-4.31.1          | 40 KB     | ##################################### | 100% 
regex-2017.11.09     | 815 KB    | ##################################### | 100% 
dill-0.2.8.2         | 110 KB    | ##################################### | 100% 
toolz-0.9.0          | 42 KB     | ##################################### | 100% 
msgpack-python-0.5.6 | 97 KB     | ##################################### | 100% 
wrapt-1.10.11        | 42 KB     | ##################################### | 100% 
msgpack-numpy-0.4.4. | 10 KB     | ##################################### | 100% 
numpy-base-1.15.2    | 4.0 MB    | ##################################### | 100% 
numpy-1.15.2         | 47 KB     | ##################################### | 100% 
plac-0.9.6           | 18 KB     | ##################################### | 100% 
cytoolz-0.9.0.1      | 407 KB    | ##################################### | 100% 
libopenblas-0.3.3    | 8.4 MB    | ##################################### | 100% 
thinc-6.10.3         | 1.5 MB    | ##################################### | 100% 
murmurhash-0.28.0    | 15 KB     | ##################################### | 100% 
six-1.11.0           | 21 KB     | ##################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ ### Install spacy language model
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ sudo pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz
Password:
Sorry, try again.
Password:
The directory '/Users/davidlaxer/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/Users/davidlaxer/Library/Caches/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz (852.3MB)
    100% |████████████████████████████████| 852.3MB 7.2kB/s 
Requirement already satisfied: spacy>=2.0.0a18 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from en-core-web-lg==2.0.0) (2.0.11)
Requirement already satisfied: numpy>=1.7 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (1.15.2)
Requirement already satisfied: murmurhash<0.29,>=0.28 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.28.0)
Requirement already satisfied: cymem<1.32,>=1.30 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (1.31.2)
Requirement already satisfied: preshed<2.0.0,>=1.0.0 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (1.0.1)
Requirement already satisfied: thinc<6.11.0,>=6.10.1 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (6.10.3)
Requirement already satisfied: plac<1.0.0,>=0.9.6 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.9.6)
Collecting pathlib (from spacy>=2.0.0a18->en-core-web-lg==2.0.0)
  Downloading https://files.pythonhosted.org/packages/ac/aa/9b065a76b9af472437a0059f77e8f962fe350438b927cb80184c32f075eb/pathlib-1.0.1.tar.gz (49kB)
    100% |████████████████████████████████| 51kB 1.1MB/s 
Requirement already satisfied: ujson>=1.35 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (1.35)
Requirement already satisfied: dill<0.3,>=0.2 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.2.8.2)
Collecting regex==2017.4.5 (from spacy>=2.0.0a18->en-core-web-lg==2.0.0)
  Downloading https://files.pythonhosted.org/packages/36/62/c0c0d762ffd4ffaf39f372eb8561b8d491a11ace5a7884610424a8b40f95/regex-2017.04.05.tar.gz (601kB)
    100% |████████████████████████████████| 604kB 6.4MB/s 
Requirement already satisfied: msgpack<1.0.0,>=0.5.6 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.5.6)
Requirement already satisfied: msgpack-numpy<1.0.0,>=0.4.1 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.4.4.2)
Requirement already satisfied: cytoolz<0.10,>=0.9.0 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.9.0.1)
Requirement already satisfied: wrapt<1.11.0,>=1.10.0 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (1.10.11)
Requirement already satisfied: tqdm<5.0.0,>=4.10.0 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (4.31.1)
Requirement already satisfied: six<2.0.0,>=1.10.0 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (1.11.0)
Requirement already satisfied: toolz>=0.8.0 in /Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/site-packages (from cytoolz<0.10,>=0.9.0->thinc<6.11.0,>=6.10.1->spacy>=2.0.0a18->en-core-web-lg==2.0.0) (0.9.0)
Installing collected packages: en-core-web-lg, pathlib, regex
  Running setup.py install for en-core-web-lg ... done
  Running setup.py install for pathlib ... done
  Found existing installation: regex 2017.11.9
    Uninstalling regex-2017.11.9:
      Successfully uninstalled regex-2017.11.9
  Running setup.py install for regex ... done
Successfully installed en-core-web-lg-2.0.0 pathlib-1.0.1 regex-2017.4.5
(lda2vec_test) MacBook-Pro:Lda2vec-Tensorflow davidlaxer$ cd tests/twenty_newsgroups/
(lda2vec_test) MacBook-Pro:twenty_newsgroups davidlaxer$ python load_20newsgroups.py
Traceback (most recent call last):
  File "load_20newsgroups.py", line 1, in <module>
    import pandas as pd
ImportError: No module named 'pandas'
(lda2vec_test) MacBook-Pro:twenty_newsgroups davidlaxer$ ipython
Python 2.7.6 |Anaconda custom (64-bit)| (default, Nov 11 2013, 10:49:09) 
Type "copyright", "credits" or "license" for more information.

IPython 5.5.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: quit()
(lda2vec_test) MacBook-Pro:twenty_newsgroups davidlaxer$ 

nateraw commented 5 years ago

Doesn't look like you ever actually ran requirements.txt for some reason :sweat_smile: . Also - my guess is that on mac, installing requirements.txt will break for you when you go to install it. You'll have to do what I said in the earlier comment about removing tensorflow from requirements.txt and running that other command manually.

It's only Mac that seems to have that issue...strange. Works fine on my Linux box.

dbl001 commented 5 years ago

conda and pip don’t always ‘play nicely together’.

After I ran: sudo pip install -r requirements.txt

load_20newsgroups.py got exceptions trying to import:

pandas, scikit-learn, keras, and pyLDAvis

pip installed pandas==0.21.1, e.g. …

Collecting pandas==0.21.1 (from -r requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/54/3e/816df3ff52b805038743c8e15a48e67524ecad9f9b597e2d10c61073cc7a/pandas-0.21.1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.9MB)
    100% |████████████████████████████████| 14.9MB 1.2MB/s
…
$ python load_20newsgroups.py
Traceback (most recent call last):
  File "load_20newsgroups.py", line 1, in <module>
    import pandas as pd
ImportError: No module named 'pandas'

I had to install it with conda:

$ conda install pandas
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.13
  latest version: 4.6.11

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /Users/davidlaxer/anaconda/envs/lda2vec_test

  added / updated specs: 
    - pandas

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    python-dateutil-2.8.0      |             py_0         219 KB  conda-forge
    pandas-0.23.4              |   py35hf8a1672_0        10.1 MB  conda-forge
    pytz-2018.9                |             py_0         229 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        10.6 MB

The following NEW packages will be INSTALLED:

    pandas:          0.23.4-py35hf8a1672_0 conda-forge
    python-dateutil: 2.8.0-py_0            conda-forge
    pytz:            2018.9-py_0           conda-forge

Proceed ([y]/n)? y

Downloading and Extracting Packages
python-dateutil-2.8. | 219 KB    | ##################################### | 100% 
pandas-0.23.4        | 10.1 MB   | ##################################### | 100% 
pytz-2018.9          | 229 KB    | ##################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Same issue with scikit-learn, keras and pyLDAvis. Anyway, it’s running now …

$ python load_20newsgroups.py
/Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.5
  return f(*args, **kwds)
Using TensorFlow backend.

---------- Tokenizing Texts ----------
879it [01:00, 15.31s/it]
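
When pip reports a successful install but the import still fails, it can help to check which interpreter is running and which packages it can actually see. The following is a small diagnostic sketch, not part of the repository; the function name `environment_report` is made up for illustration, and it does not by itself explain what went wrong in the session above.

```python
import importlib.util
import sys


def environment_report(packages):
    """Return the interpreter path, its prefix, and whether each package is importable."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "importable": {
            name: importlib.util.find_spec(name) is not None for name in packages
        },
    }


if __name__ == "__main__":
    # These are import names, not distribution names:
    # scikit-learn is imported as "sklearn".
    report = environment_report(["pandas", "sklearn", "keras", "pyLDAvis"])
    print("python:", report["executable"])
    print("prefix:", report["prefix"])
    for name, found in sorted(report["importable"].items()):
        print("{:10s} {}".format(name, "ok" if found else "MISSING"))
```

If the prefix is not the lda2vec_test environment, or a package shows as missing, running "python -m pip install -r requirements.txt" with that same interpreter guarantees that pip installs into the environment Python is using.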

dbl001 commented 5 years ago

FileNotFoundError: [Errno 2] No such file or directory: '/media/dlmachine/SSD_2/embeddings/glove.6B.300d.txt'

dbl001 commented 5 years ago

This worked:

### Create new conda environment
conda create --name lda2vec_test python=3.5

### Activate new environment
source activate lda2vec_test

### Add conda forge so you can install spacy
conda config --add channels conda-forge

### Install my spacy version
conda install spacy=2.0.11

### Install the CPU build of TensorFlow 1.5.0 for Mac
pip install https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.5.0-py3-none-any.whl

### Install the rest of the requirements
sudo pip install -r requirements.txt

### Install spacy language model
sudo pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.0.0/en_core_web_lg-2.0.0.tar.gz

### Install this package in development mode
python setup.py develop

dbl001 commented 5 years ago

Warnings:

/Users/davidlaxer/Lda2vec-Tensorflow/lda2vec/nlppipe.py:129: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
  all_embs = np.stack(embeddings_index.values())

/Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.5
  return f(*args, **kwds)
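
The FutureWarning from nlppipe.py goes away if the dictionary values are converted to a list before stacking, since np.stack expects a sequence rather than a view object. A minimal sketch, with a tiny made-up `embeddings_index` standing in for the real word-to-vector dictionary:

```python
import numpy as np

# Hypothetical stand-in for the real embeddings dictionary (word -> vector).
embeddings_index = {
    "cat": np.array([0.1, 0.2, 0.3]),
    "dog": np.array([0.4, 0.5, 0.6]),
}

# dict.values() returns a view, not a sequence; wrapping it in list()
# gives np.stack the sequence type it asks for.
all_embs = np.stack(list(embeddings_index.values()))

print(all_embs.shape)  # (2, 3)
```

The resulting array is the same as before; only the argument type passed to np.stack changes.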

On Apr 5, 2019, at 2:45 AM, David Laxer davidl@softintel.com wrote:

FileNotFoundError: [Errno 2] No such file or directory: '/media/dlmachine/SSD_2/embeddings/glove.6B.300d.txt'

On Apr 4, 2019, at 11:12 PM, David Laxer <davidl@softintel.com mailto:davidl@softintel.com> wrote:

conda and pip don’t always ‘play nicely together’.

After I ran: sudo pip install -r requirements.txt

load_20newsgroups.py got exceptions trying to import:

pandas, scikit-learn, tears, and pyLDAvis

pip installed pandas==0.21.1 E.g. …

Collecting pandas==0.21.1 (from -r requirements.txt (line 2)) Downloading https://files.pythonhosted.org/packages/54/3e/816df3ff52b805038743c8e15a48e67524ecad9f9b597e2d10c61073cc7a/pandas-0.21.1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl https://files.pythonhosted.org/packages/54/3e/816df3ff52b805038743c8e15a48e67524ecad9f9b597e2d10c61073cc7a/pandas-0.21.1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.9MB) 100% |████████████████████████████████| 14.9MB 1.2MB/s … $ python load_20newsgroups.py Traceback (most recent call last): File "load_20newsgroups.py", line 1, in import pandas as pd ImportError: No module named 'pandas'

I had to install it with conda:

$ conda install pandas Solving environment: done

==> WARNING: A newer version of conda exists. <== current version: 4.5.13 latest version: 4.6.11

Please update conda by running

$ conda update -n base -c defaults conda

Package Plan

environment location: /Users/davidlaxer/anaconda/envs/lda2vec_test

added / updated specs:

  • pandas

The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
python-dateutil-2.8.0      |             py_0         219 KB  conda-forge
pandas-0.23.4              |   py35hf8a1672_0        10.1 MB  conda-forge
pytz-2018.9                |             py_0         229 KB  conda-forge
------------------------------------------------------------
                                       Total:        10.6 MB

The following NEW packages will be INSTALLED:

pandas:          0.23.4-py35hf8a1672_0 conda-forge
python-dateutil: 2.8.0-py_0            conda-forge
pytz:            2018.9-py_0           conda-forge

Proceed ([y]/n)? y

Downloading and Extracting Packages python-dateutil-2.8. | 219 KB | ##################################### | 100% pandas-0.23.4 | 10.1 MB | ##################################### | 100% pytz-2018.9 | 229 KB | ##################################### | 100% Preparing transaction: done Verifying transaction: done Executing transaction: done

Same issue with scikit-learn, keras and ldaPYvis. Anyway, it’s running now …

$ python load_20newsgroups.py
/Users/davidlaxer/anaconda/envs/lda2vec_test/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.5
  return f(*args, **kwds)
Using TensorFlow backend.

---------- Tokenizing Texts ---------- 879it [01:00, 15.31s/it]
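The pip output above reports success while `python` still cannot import pandas, which suggests pip and the environment's interpreter were pointing at different Python installs. That is only a guess from the logs, not something confirmed in this thread. A minimal sketch to check what the active interpreter can actually import follows; the module list is an assumption taken from the package list at the top of this issue (import names, not pip names), and `missing_modules` is a hypothetical helper, not part of this repo.

```python
import importlib.util
import sys

# Import names assumed from the packages listed in this issue.
# Note that pip package names can differ (scikit-learn imports as "sklearn").
REQUIRED_MODULES = [
    "pandas", "numpy", "sklearn", "tensorflow",
    "pyLDAvis", "keras", "spacy", "tqdm",
]


def missing_modules(names):
    """Return the subset of top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Shows which Python is running, e.g. the conda env's or the system one.
    print("Interpreter:", sys.executable)
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not importable from this interpreter:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

If modules show up as missing, running `python -m pip install -r requirements.txt` from inside the activated environment (without `sudo`) makes pip install into the same interpreter that `python` resolves to.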

On Apr 4, 2019, at 9:56 PM, Nathan Raw <notifications@github.com> wrote:

Doesn't look like you ever actually ran requirements.txt for some reason 😅 . Also - my guess is that on mac, installing requirements.txt will break for you when you go to install it. You'll have to do what I said in the earlier comment about removing tensorflow from requirements.txt and running that other command manually.

It's only Mac that seems to have that issue...strange. Works fine on my Linux box.


nateraw commented 5 years ago

Whoops - thought I sent a reply to you earlier but must not have hit send. The embeddings path was left in there by accident. That's where I keep my embeddings on my computer so I don't have to keep copying them to every repo I use them in. I'll make sure to push a quick switch for that.

As for the mac vs windows install stuff, I know the issues on both, I just don't know the best way to solve the problem. Looking into some solutions. Main solution would probably be to just use a newer version of TF instead of the outdated 1.5.0. Problem is my hardware restricts me to 1.5.0 due to my CPU not being compatible with AVX. Will think on this a bit more...
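For anyone who wants to check the AVX situation on their own machine, here is a minimal sketch. It assumes Linux exposes CPU flags in `/proc/cpuinfo` and that Intel Macs report them through `sysctl -n machdep.cpu.features` (where AVX shows up as `AVX1.0`); other platforms are not handled. The function names are hypothetical and not part of this repo.

```python
import platform
import subprocess
from pathlib import Path


def cpu_flags_from_text(cpuinfo_text):
    """Collect lowercase CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "flags":
            flags.update(value.lower().split())
    return flags


def has_avx(cpuinfo_text):
    """True if any flag starts with 'avx' (avx, avx2, or macOS-style avx1.0)."""
    return any(flag.startswith("avx") for flag in cpu_flags_from_text(cpuinfo_text))


def read_cpu_feature_text():
    """Best-effort read of the CPU feature list; returns '' when unknown."""
    system = platform.system()
    if system == "Linux":
        return Path("/proc/cpuinfo").read_text()
    if system == "Darwin":
        # Assumption: this sysctl key exists on Intel Macs only.
        result = subprocess.run(
            ["sysctl", "-n", "machdep.cpu.features"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        return "flags : " + result.stdout
    return ""


if __name__ == "__main__":
    print("AVX reported by CPU:", has_avx(read_cpu_feature_text()))
```

An empty result (for example on an unsupported platform) prints `False`, so a `False` here means "not detected" rather than proof that the CPU lacks AVX.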

dbl001 commented 5 years ago

My CPU on my Mac doesn’t support AVX either. I’m running Tensorflow 1.12 and I think also 1.18. I think I installed it from Conda. Stay tuned.

nateraw commented 5 years ago

Interesting...did you have to pass flags to make it work?? Maybe the AVX stuff is built into that mac specific installer. When v1.6 got released I looked into how to install it but couldn't get it to work, so I haven't tried since. CPU only is obviously not an option.

dbl001 commented 5 years ago

I may have built Tensorflow from source on my Mac.

What are the 2-3 character tokens in the word to topic output that are not words and do appear in the original document file?

nateraw commented 5 years ago

No idea. Haven't had time to look. Will be able to tonight/tomorrow.

nateraw commented 5 years ago

This works for me and should solve all of our problems. The following are instructions for building a Docker environment on Ubuntu 16.04. If you're on a different OS, you'll have to navigate to the instructions for your OS (from the sidebars in the links provided).

First, install Docker CE. I'm on Ubuntu, so I followed these instructions.

Next, if you're on Linux and want to use the GPU, you have to install nvidia-docker by following the instructions in the readme.

Once you're finished, you can build/run the tensorflow 1.5 GPU Docker image from tensorflow/tensorflow:

sudo docker run -it --rm --runtime=nvidia tensorflow/tensorflow:1.5.0-gpu-py3 bash

After you have this working, we should be able to standardize the install across platforms.

sudheernaidu53 commented 5 years ago

pip install -r requirements.txt
ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
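This error means pip did not find a `requirements.txt` in the directory it was run from. Earlier in this thread the updated file was said to live on the dev branch, so a checkout of another branch, or running from outside the repo root, may be the cause; that is a guess. A minimal sketch for locating the file before calling pip follows; `find_requirements` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path


def find_requirements(start=".", filename="requirements.txt"):
    """Search `start` and each of its parent directories for `filename`.

    Returns the first matching path, or None if nothing is found.
    """
    start = Path(start).resolve()
    for directory in [start] + list(start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_requirements()
    if found is None:
        print("No requirements.txt here or in any parent directory.")
    else:
        # Pass the full path to pip, e.g. python -m pip install -r <path>
        print("Found:", found)
```

If nothing is found, checking `git branch` and the current working directory is a reasonable next step.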

dbl001 commented 5 years ago

Is this happening in Docker?
