Closed spadavec closed 6 years ago
Hi Spadavec, our implementation is based on Keras + TensorFlow. If your GPU is not being used, it is probably because one of these two is not recognizing it.
Have you verified that TensorFlow/Keras is using your GPU in other settings? (https://www.tensorflow.org/programmers_guide/using_gpu)
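One quick way to run that check from Python is to list the devices TensorFlow reports and keep only the GPUs. This is a minimal sketch, not part of chemical_vae: the helper names `gpu_device_names` and `visible_gpus` are made up for illustration, and it assumes a TensorFlow build that ships `tensorflow.python.client.device_lib`.

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is 'GPU'.

    `devices` is any iterable of objects exposing `.name` and
    `.device_type`, such as the list returned by
    device_lib.list_local_devices().
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def visible_gpus():
    """Ask TensorFlow which GPUs it can see (hypothetical helper)."""
    # Imported lazily so gpu_device_names() stays usable without TensorFlow.
    from tensorflow.python.client import device_lib

    return gpu_device_names(device_lib.list_local_devices())
```

Calling `visible_gpus()` inside the same conda environment that runs the training script should return a non-empty list if TensorFlow can reach the GPU. An empty list would suggest the problem is in the TensorFlow install rather than in the training code.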
@beangoben yes, Keras + TF work on my GPU for other codebases. TF seems to suggest that it can see my GPU as well:
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
2018-02-14 16:48:53.514406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0
2018-02-14 16:48:53.515305: I tensorflow/core/common_runtime/direct_session.cc:257] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0
Keras is also set up for TF:
cat $HOME/.keras/keras.json
{
"epsilon": 1e-07,
"floatx": "float32",
"image_data_format": "channels_last",
"backend": "tensorflow",
"device": "gpu0"
}
Although it seems that there is a potential version mismatch:
python -c 'import keras; print(keras.__version__)'
/home/spadavec/miniconda2/envs/chemvae/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
/home/spadavec/miniconda2/envs/chemvae/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
return f(*args, **kwds)
2.0.7
I installed everything via Anaconda; is it possible the environment.yml file isn't up-to-date?
The environment.yml specifies keras=2.0.6, while I believe you are using 2.0.7. I don't know if that is affecting anything. Could you try importing tensorflow and keras (checking on the GPU), and then chem_vae, and see if that works?
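To check for this kind of drift more systematically, the pins in environment.yml can be compared against what is actually installed. The sketch below is only an illustration: `parse_conda_pins` and `find_mismatches` are hypothetical helpers, and the only pin taken from this thread is `keras=2.0.6`. The rest of the real file is not shown here, so nothing else is assumed about it.

```python
def parse_conda_pins(lines):
    """Map package name -> pinned version from 'name=version' spec lines.

    Accepts list items as written in an environment.yml, e.g.
    '- keras=2.0.6', 'keras==2.0.6' or 'keras=2.0.6=py36_0'.
    Lines without '=' (such as 'name: chemvae') are ignored.
    """
    pins = {}
    for line in lines:
        spec = line.strip().lstrip("-").strip()
        if not spec or spec.startswith("#") or "=" not in spec:
            continue
        name, _, rest = spec.partition("=")
        # Drop a second '=' (pip style) and any trailing build string.
        pins[name.strip()] = rest.lstrip("=").split("=")[0].strip()
    return pins


def find_mismatches(pins, installed):
    """Return {package: (pinned, installed)} where the versions differ."""
    return {
        name: (pinned, installed[name])
        for name, pinned in pins.items()
        if name in installed and installed[name] != pinned
    }


# The one pin quoted in this thread versus the version reported above.
pins = parse_conda_pins(["- keras=2.0.6"])
mismatches = find_mismatches(pins, {"keras": "2.0.7"})
print(mismatches)  # {'keras': ('2.0.6', '2.0.7')}
```

In practice the `installed` dictionary would be filled from each package's `__version__`, as in the `python -c 'import keras; print(keras.__version__)'` check above.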
@beangoben sorry for the confusion, but do you want me to do the following (this is from the zinc directory in the repo)?
(chemvae) spadavec@turing:~/chemical_vae/models/zinc$ python
Python 3.6.4 | packaged by conda-forge | (default, Dec 23 2017, 16:31:06)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
/home/spadavec/miniconda2/envs/chemvae/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
return f(*args, **kwds)
>>> import keras
/home/spadavec/miniconda2/envs/chemvae/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
>>> import chemvae
>>> from chemvae import train_vae
>>>
My understanding is that this should launch the train_vae code, but it doesn't.
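A likely reason nothing happens is that importing a Python module only runs its top-level code, so anything placed under an `if __name__ == "__main__":` guard is skipped on import. Whether train_vae is structured that way is an assumption here, since the thread does not show its source. The toy script below (the file name `toy_train.py` and the `RESULT` variable are made up) shows the difference between importing a file and running it as a script.

```python
import os
import runpy
import tempfile

# A stand-in script with the usual main guard (hypothetical contents).
SCRIPT = '''
def main():
    return "training started"

if __name__ == "__main__":
    RESULT = main()
'''


def run_both_ways():
    """Return (ran_on_import, ran_as_script) for the toy script."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_train.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        # __name__ == "toy_train": the same situation as `import toy_train`.
        imported = runpy.run_path(path, run_name="toy_train")
        # __name__ == "__main__": the same situation as `python toy_train.py`.
        executed = runpy.run_path(path, run_name="__main__")
    return "RESULT" in imported, "RESULT" in executed


print(run_both_ways())  # (False, True)
```

If train_vae does follow this pattern, it would need to be run as a script (for example with `python -m`) rather than imported from the interpreter. Any arguments it expects are not covered in this thread.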
I just did a 'clean' install on a new machine without using conda install, and it seems to work now. I'll leave this issue open for now, and once I figure out what the conflict is, I'll post here and close out. If you want to close this out now, I understand!
Great! Will close if there are no additional related questions.
While running the train_vae script, apparently my GPU isn't being used (the CPU usage is 300%+, but the GPU seems to be unused). My keras.json file specifies that the backend is tensorflow, and the KERAS_BACKEND env variable is also set to tensorflow. Is there something else I can do to use my GPU for training?