IBM / tensorflow-large-model-support

Large Model Support in Tensorflow
Apache License 2.0

AttributeError: module 'tensorflow._api.v2.config.experimental' has no attribute 'set_lms_enabled' #49

Closed GF-Huang closed 3 years ago

GF-Huang commented 3 years ago


smatzek commented 3 years ago

tensorflow-base needs to come from public.dhe.ibm.com as well. You may need to use strict channel priority. See the "Helping the conda installer: strict channel priority and the powerai-release package" section here: https://www.ibm.com/support/knowledgecenter/SS5SF7_1.7.0/navigation/wmlce_install.htm
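To confirm which TensorFlow build is actually active after reinstalling, a quick check for the attribute named in the error can help. This is only a minimal sketch: the helper names has_lms_api and check_installed_tensorflow are hypothetical, and the only assumption is that an LMS-enabled build exposes set_lms_enabled under tf.config.experimental, as the issue title implies.

```python
def has_lms_api(experimental) -> bool:
    """Return True if a tf.config.experimental-like object exposes set_lms_enabled."""
    return callable(getattr(experimental, "set_lms_enabled", None))


def check_installed_tensorflow() -> str:
    """Describe whether the TensorFlow in this environment has the LMS API."""
    try:
        import tensorflow as tf  # imported lazily so the helper above stays usable without it
    except ImportError:
        return "tensorflow is not installed in this environment"
    if has_lms_api(tf.config.experimental):
        return f"tensorflow {tf.__version__}: set_lms_enabled is available"
    return (
        f"tensorflow {tf.__version__}: set_lms_enabled is missing; "
        "tensorflow-base probably came from a different channel"
    )
```

Calling print(check_installed_tensorflow()) inside the activated conda environment shows whether the LMS-enabled package was picked up.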

GF-Huang commented 3 years ago

@smatzek Thanks for the guide. By the way, which versions of cudatoolkit and cudnn should I install for TFLMS?

I now get this error: InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version


jayfurmanek commented 3 years ago

You need a newer driver. WML-CE uses CUDA 10.2, so you should have a driver compatible with that version (CUDA itself is installed as a conda package).
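A rough way to check the installed driver against CUDA 10.2 is sketched below. It assumes the minimum Linux driver for CUDA 10.2 is 440.33, which is taken from NVIDIA's CUDA release notes rather than from this thread (Windows has a different minimum), and that nvidia-smi is on the PATH. The function names are hypothetical.

```python
import subprocess

# Assumption: minimum Linux driver for CUDA 10.2, per NVIDIA's CUDA release notes.
MIN_DRIVER_CUDA_10_2 = (440, 33)


def parse_version(text):
    """Turn a dotted version such as '440.33.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(driver_version, minimum=MIN_DRIVER_CUDA_10_2):
    """Return True if the driver version is at least the given minimum."""
    return parse_version(driver_version) >= minimum


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()
```

On a machine with a GPU, driver_is_sufficient(installed_driver_version()) returning False would match the "CUDA driver version is insufficient" error above.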

GF-Huang commented 3 years ago

So do I need to install only a new driver, or a driver together with cudatoolkit?

This is really a headache. I can't work out how cudatoolkit, the CUDA driver, and CUDA itself relate to each other or how they differ.

jayfurmanek commented 3 years ago

You need just the driver. Although if you install the entire cuda toolkit, the driver will be installed as well.

For your understanding: there are a few different pieces that are needed for the GPUs to work: The "driver", which includes:

When using WML CE (or even just bare Anaconda), the cuda piece is provided by a conda package. This package contains all the components from the cuda toolkit that are needed at runtime (it doesn't include the nvcc compilers - WML CE provides those in a package called cudatoolkit-dev).
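To see which CUDA pieces the conda environment itself provides, the package list can be inspected. The sketch below assumes that conda list --json prints a JSON array whose entries carry "name", "version", and "channel" fields; the helper names are hypothetical.

```python
import json
import subprocess


def find_packages(conda_list_json, names):
    """Map each requested package name to (version, channel) if it is installed."""
    wanted = set(names)
    return {
        pkg["name"]: (pkg.get("version"), pkg.get("channel"))
        for pkg in json.loads(conda_list_json)
        if pkg.get("name") in wanted
    }


def conda_list_json():
    """Return the raw JSON text from `conda list --json` for the active environment."""
    result = subprocess.run(
        ["conda", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, find_packages(conda_list_json(), ["cudatoolkit", "cudatoolkit-dev", "tensorflow-base"]) shows the runtime package, whether the compiler package is present, and which channel each one came from.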

The driver can be downloaded and installed separately, or along with the complete cudatoolkit: https://www.nvidia.com/Download/index.aspx

A given driver generally works with a certain version of CUDA and with older versions, so keeping the driver up to date is always a good idea.