IBM / tensorflow-large-model-support

Large Model Support in Tensorflow
Apache License 2.0

Installation Problems #34

Closed schdief06 closed 4 years ago

schdief06 commented 4 years ago

Hi,

I'm struggling to install tensorflow-gpu from the WML CE channel. I added the channel by typing

conda config --prepend channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/

After installation via conda (even with --strict-channel-priority), the following line fails because there is no attribute named 'set_lms_enabled'.

tf.config.experimental.set_lms_enabled(True)

Is there any way to force conda to use the WML CE channel during installation? Or what am I doing wrong?
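A quick way to confirm whether the installed TensorFlow build actually carries the LMS patch is to probe for the attribute before calling it. The following is only a minimal sketch: `has_dotted_attr` and `lms_available` are hypothetical helper names, and the only assumption taken from this thread is that an LMS-enabled build exposes `tf.config.experimental.set_lms_enabled`.

```python
import importlib


def has_dotted_attr(root, dotted_path):
    """Return True if every attribute in dotted_path resolves, starting at root."""
    obj = root
    for name in dotted_path.split("."):
        if not hasattr(obj, name):
            return False
        obj = getattr(obj, name)
    return True


def lms_available():
    """Return True/False for LMS support, or None if TensorFlow cannot be imported."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None
    # Attribute path taken from the failing line quoted in this thread.
    return has_dotted_attr(tf, "config.experimental.set_lms_enabled")


if __name__ == "__main__":
    print("LMS available:", lms_available())
```

If this reports `False`, the TensorFlow that Python imports is most likely a stock build rather than the patched one from the WML CE channel.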

Thanks for helping!

schdief06 commented 4 years ago

I just noticed: when typing pip list there is no package named tensorflow-gpu, just tensorflow. Is it supposed to be like that? It wasn't in previous versions.

smatzek commented 4 years ago

Issue #29 had similar problems. You should use conda list instead of pip list as conda will show pip packages but pip will not show conda packages.

You may want or need to start with a new conda environment. You should follow the instructions here: https://www.ibm.com/support/knowledgecenter/SS5SF7_1.7.0/navigation/wmlce_install.html , and verify that tensorflow, tensorflow-gpu, tensorflow-base, and tensorflow-estimator are or will be installed from the WML CE channel. I suspect that one of them is not installed from the channel.
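The channel check suggested above can be automated by parsing the output of `conda list --show-channel-urls`. This is a rough sketch under stated assumptions: it treats the fourth whitespace-separated column as the channel (the exact layout may differ between conda versions), the helper names are hypothetical, and the `ibm-ai/conda` fragment is simply taken from the channel URL quoted earlier in this thread.

```python
import subprocess

# The four packages named in the comment above.
WANTED = ("tensorflow", "tensorflow-gpu", "tensorflow-base", "tensorflow-estimator")


def parse_conda_list(text):
    """Map package name -> channel column (None when no channel column is shown)."""
    channels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        parts = line.split()
        if len(parts) < 2:
            continue
        channels[parts[0]] = parts[3] if len(parts) >= 4 else None
    return channels


def packages_not_from(text, channel_fragment, wanted=WANTED):
    """List wanted packages that are missing or whose channel lacks channel_fragment."""
    channels = parse_conda_list(text)
    return [name for name in wanted
            if channel_fragment not in (channels.get(name) or "")]


def check_current_env(channel_fragment="ibm-ai/conda"):
    """Run conda in the active environment (assumes `conda` is on PATH)."""
    result = subprocess.run(
        ["conda", "list", "--show-channel-urls"],
        capture_output=True, text=True, check=True,
    )
    return packages_not_from(result.stdout, channel_fragment)
```

An empty list from `check_current_env()` would suggest all four packages came from the expected channel; any names returned are the ones worth reinstalling.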

schdief06 commented 4 years ago

Thanks for your answer. I've just started a new Python environment, following the instructions you linked. These are all the commands I used to create the environment and install TensorFlow.

conda create -n tf-lms2
conda activate tf-lms2
conda install --strict-channel-priority tensorflow-gpu

Now I'm getting the following error in Visual Studio:

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy c-extensions failed.
- Try uninstalling and reinstalling numpy.
- If you have already done that, then:
  1. Check that you expected to use Python3.7 from "D:\anaconda3\envs\tf-lms2\python.exe",
     and that you have no directories in your PATH or PYTHONPATH that can
     interfere with the Python and numpy version "1.18.1" you're trying to use.
  2. If (1) looks fine, you can open a new issue at
     https://github.com/numpy/numpy/issues.  Please include details on:
     - how you installed Python
     - how you installed numpy
     - your operating system
     - whether or not you have multiple versions of Python installed
     - if you built from source, your compiler versions and ideally a build log

- If you're working with a numpy git repository, try git clean -xdf
  (removes all files not under version control) and rebuild numpy.

Note: this error has many possible causes, so please don't comment on
an existing issue about this - open a new one instead.

Last time I ran pip uninstall numpy followed by pip install numpy==1.18.1, which solved the error on importing numpy but led to the error I described above.

Here is the output of conda list:

_tflow_select             2.1.0                       gpu
absl-py                   0.9.0                    py37_0
asn1crypto                1.3.0                    py37_0
astor                     0.8.0                    py37_0
blas                      1.0                         mkl
blinker                   1.4                      py37_0
ca-certificates           2020.1.1                      0
cachetools                3.1.1                      py_0
certifi                   2020.4.5.1               py37_0
cffi                      1.14.0           py37h7a1dbc1_0
chardet                   3.0.4                 py37_1003
click                     7.1.1                      py_0
cryptography              2.8              py37h7a1dbc1_0
cudatoolkit               10.1.243             h74a9793_0
cudnn                     7.6.5                cuda10.1_0
gast                      0.2.2                    py37_0
google-auth               1.13.1                     py_0
google-auth-oauthlib      0.4.1                      py_2
google-pasta              0.2.0                      py_0
grpcio                    1.27.2           py37h351948d_0
h5py                      2.10.0           py37h5e291fa_0
hdf5                      1.10.4               h7ebc959_0
icc_rt                    2019.0.0             h0cc432a_1
idna                      2.9                        py_1
intel-openmp              2020.0                      166
keras-applications        1.0.8                      py_0
keras-preprocessing       1.1.0                      py_1
libprotobuf               3.11.4               h7bd577a_0
markdown                  3.1.1                    py37_0
mkl                       2020.0                      166
mkl-service               2.3.0            py37hb782905_0
mkl_fft                   1.0.15           py37h14836fe_0
mkl_random                1.1.0            py37h675688f_0
numpy                     1.18.1           py37h93ca92e_0
numpy-base                1.18.1           py37hc3f5095_1
oauthlib                  3.1.0                      py_0
openssl                   1.1.1g               he774522_0
opt_einsum                3.1.0                      py_0
pip                       20.0.2                   py37_1
protobuf                  3.11.4           py37h33f27b4_0
pyasn1                    0.4.8                      py_0
pyasn1-modules            0.2.7                      py_0
pycparser                 2.20                       py_0
pyjwt                     1.7.1                    py37_0
pyopenssl                 19.1.0                   py37_0
pyreadline                2.1                      py37_1
pysocks                   1.7.1                    py37_0
python                    3.7.7                h60c2a47_2
requests                  2.23.0                   py37_0
requests-oauthlib         1.3.0                      py_0
rsa                       4.0                        py_0
scipy                     1.4.1            py37h9439919_0
setuptools                46.1.3                   py37_0
six                       1.14.0                   py37_0
sqlite                    3.31.1               h2a8f88b_1
tensorboard               2.1.0                     py3_0
tensorflow                2.1.0           gpu_py37h7db9008_0
tensorflow-base           2.1.0           gpu_py37h55f5790_0
tensorflow-estimator      2.1.0              pyhd54b08b_0
tensorflow-gpu            2.1.0                h0d30ee6_0
termcolor                 1.1.0                    py37_1
urllib3                   1.25.8                   py37_0
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_1
werkzeug                  0.14.1                   py37_0
wheel                     0.34.2                   py37_0
win_inet_pton             1.1.0                    py37_0
wincertstore              0.2                      py37_0
wrapt                     1.12.1           py37he774522_1
zlib                      1.2.11               h62dcd97_4

Thanks for helping. I'm just clueless about how to solve this mess... :/

smatzek commented 4 years ago

The error message you pasted helped me figure out what is going on. WML CE is only provided for Linux. It seems you are using Windows given your Visual Studio reference and the "D:\anaconda3" path.

From other users' attempts we know that the 2.1.0 patch does not compile on Windows, and we haven't built the still-in-development 2.2.0 patch on Windows either.
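Since the root cause here was the operating system, a small guard at the top of a script can surface the problem before any LMS call is attempted. This is only an illustrative sketch: `wmlce_platform_supported` is a hypothetical helper, and the Linux-only rule is taken from the statement above rather than from any official compatibility list.

```python
import platform


def wmlce_platform_supported(system_name=None):
    """Return True only for Linux, per the maintainer's statement in this thread."""
    if system_name is None:
        # platform.system() returns e.g. 'Linux', 'Windows', or 'Darwin'.
        system_name = platform.system()
    return system_name == "Linux"


if __name__ == "__main__":
    if not wmlce_platform_supported():
        print("WML CE (and the LMS patch) is only provided for Linux; "
              "detected:", platform.system())
```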

smatzek commented 4 years ago

I've closed this issue since the root cause was non-support for Windows, but if you have more questions about LMS you can feel free to ask them, or open other issues if you retry on Linux and encounter problems.

schdief06 commented 4 years ago

Yes, I'm on a Windows machine. I didn't know it wasn't supported. Will it be in the future? Anyway, I guess I'll have to switch to a Linux machine then...