tensorflow / models

Models and examples built with TensorFlow

#Textsum# How to use Multi-GPUs during training? #530

Closed licaoyuan123 closed 8 years ago

licaoyuan123 commented 8 years ago


In seq2seq_attention.py, the following flag definition sets the number of GPUs to 0 by default:

    tf.app.flags.DEFINE_integer('num_gpus', 0, 'Number of gpus used.')

I tried to run the code on a machine with 4 GPUs, so I changed the default to 4:

    tf.app.flags.DEFINE_integer('num_gpus', 4, 'Number of gpus used.')

However, this caused the following error message:

tensorflow/core/client/tensor_c_api.cc:485] Cannot assign a device to node 'seq2seq/encoder3/BiRNN_BW/BiRNN_BW/Max': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available. [[Node: seq2seq/encoder3/BiRNN_BW/BiRNN_BW/Max = Max[T=DT_INT32, keep_dims=false, _device="/device:GPU:0"](article_lens, seq2seq/encoder3/BiRNN_BW/BiRNN_BW/Const_1)]]

In the _Train(model, data_batcher) function, the first statement is with tf.device('/cpu:0'): Does this mean that only the CPU will be used during training? I tried modifying this line to 'with tf.device('/cpu:0'):', but it reported an error similar to the one above.

So how can I use multiple GPUs to run the training task? Thank you.

In seq2seq_attention.py, I added a config argument to sv.prepare_or_wait_for_session:

    sess = sv.prepare_or_wait_for_session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True))

Is this the correct way to use multiple GPUs?
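
For readers trying to interpret the error above, here is a toy model in plain Python of what allow_soft_placement changes. It is only a sketch and not TensorFlow's actual placer: the names GPU_KERNELS, PlacementError and place are hypothetical, and the kernel table is an assumption chosen to mirror the log message, which reports that no GPU kernel is available for the Max op with T=DT_INT32.

```python
# Toy model of device placement. This is NOT TensorFlow code; every name
# here is hypothetical and exists only to illustrate the idea.

# Assumed registry of (op, dtype) pairs that have a GPU kernel in this toy.
# "Max" on int32 is deliberately absent, matching the error in the log.
GPU_KERNELS = {("Max", "float32"), ("MatMul", "float32")}


class PlacementError(Exception):
    """Raised when an explicit device request cannot be satisfied."""


def place(op, dtype, requested_device, allow_soft_placement=False):
    """Return the device the op would run on in this toy model."""
    wants_gpu = "gpu" in requested_device.lower()
    # CPU requests, and GPU requests with a matching kernel, are honored.
    if not wants_gpu or (op, dtype) in GPU_KERNELS:
        return requested_device
    # No GPU kernel: soft placement falls back to the CPU instead of failing.
    if allow_soft_placement:
        return "/device:CPU:0"
    raise PlacementError(
        "Cannot assign a device to %s[%s]: no supported kernel for %s"
        % (op, dtype, requested_device)
    )


if __name__ == "__main__":
    # With soft placement the int32 Max op is moved to the CPU.
    print(place("Max", "int32", "/device:GPU:0", allow_soft_placement=True))
    # Without it, the explicit GPU request is an error.
    try:
        place("Max", "int32", "/device:GPU:0")
    except PlacementError as err:
        print("strict placement failed:", err)
```

In this reading, allow_soft_placement=True would let ops without a GPU kernel run on the CPU while the remaining ops stay on the requested GPU, and log_device_placement=True would show where each op actually ended up. Whether that resolves this particular textsum error is not confirmed in this thread.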

jart commented 8 years ago

Is it possible that this is related to tensorflow/tensorflow#2285?

aselle commented 8 years ago

Automatically closing due to lack of recent activity. Please reopen when further information becomes available.