tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

GPU training not working with Deep learning AMI P3 instance #397

Closed mohammedayub44 closed 6 years ago

mohammedayub44 commented 6 years ago

I'm using an AWS Deep Learning AMI (Ubuntu 15.0) on a P3 instance. For some reason my tensorflow_p36 environment does not detect the 4 GPUs. I ran the GPU test below and it returns False:

import tensorflow as tf
tf.test.is_gpu_available()
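The check above can be extended to compare what the NVIDIA driver reports with what TensorFlow sees. The following is a minimal sketch, not a confirmed diagnosis. The helper names (`parse_visible_devices`, `driver_gpu_names`, `tensorflow_gpu_names`) are hypothetical, and it assumes `nvidia-smi` is on the PATH and that `tensorflow.python.client.device_lib` is importable in the active environment.

```python
import shutil
import subprocess


def parse_visible_devices(value):
    """Parse a CUDA_VISIBLE_DEVICES string into a list of device IDs.

    Returns None when the variable is unset (CUDA then exposes every GPU)
    and an empty list when it is set to an empty string (no GPU is exposed).
    """
    if value is None:
        return None
    return [item.strip() for item in value.split(",") if item.strip()]


def driver_gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def tensorflow_gpu_names():
    """Return the GPU device names TensorFlow can see, or None if it is not installed."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]
```

If `driver_gpu_names()` lists the GPUs but `tensorflow_gpu_names()` returns an empty list, the driver is working and the gap is on the TensorFlow side, for example a CPU-only TensorFlow build or a CUDA library version that does not match the installed build. Passing `os.environ.get("CUDA_VISIBLE_DEVICES")` to `parse_visible_devices` shows which device IDs the process is allowed to use.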

Below is the result of my nvidia-smi output:

[image: nvidia-smi output]

I also tried the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m nmt.nmt \
    --src=en --tgt=es \
    --vocab_prefix=/home/ubuntu/mukund_traildata/vocab \
    --train_prefix=/home/ubuntu/mukund_traildata/new_train \
    --dev_prefix=/home/ubuntu/mukund_traildata/new_dev \
    --test_prefix=/home/ubuntu/mukund_traildata/new_testing \
    --out_dir=/home/ubuntu/mukund_traildata/models/model3 \
    --num_gpus=1 \
    --num_train_steps=1200 \
    --steps_per_stats=100 \
    --num_layers=4 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu \
    --log_device_placement=True

There is no GPU activity during the run:

[image: screenshot showing no GPU activity]

Let me know if I'm missing any configuration. I'd appreciate any help.

Mohammed Ayub