This part of the blog post is about model parallelism; data parallelism is discussed in other issues.
You can find a concise description here.
@tboquet Thank you for the materials you provided, which made the two parallelism approaches in ConvNets clear to me. So, currently, data parallelism in Keras (I mean via the simplified APIs) is still a work in progress, right?
Right! You could use model parallelism with TensorFlow, but there is no unified Keras API to do this. For data parallelism, you could take a look at https://github.com/mila-udem/platoon if you want inspiration for developing your own solution.
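(To make the distinction concrete, here is a minimal sketch of hand-rolled model parallelism with the TensorFlow backend: different layers of one model are pinned to different GPUs via device scopes. The layer sizes are purely illustrative.)

import tensorflow as tf
from keras.layers import Dense, Input
from keras.models import Model

inp = Input(shape=(784,))

# First half of the network lives on GPU 0...
with tf.device('/gpu:0'):
    h = Dense(512, activation='relu')(inp)

# ...and the second half on GPU 1: one model split across devices.
with tf.device('/gpu:1'):
    out = Dense(10, activation='softmax')(h)

model = Model(inp, out)
model.compile(optimizer='adam', loss='categorical_crossentropy')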
@fchollet Thanks for the multi-GPU TensorFlow example.
I have a PC with 2 Titan X GPUs and I tried the following code:
import tensorflow as tf
from keras import backend as K
from keras.layers import Dense
from keras.models import Sequential

sess = tf.Session()
K.set_session(sess)  # register the session with Keras, as in the blog post

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 784))
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

with tf.device('/gpu:0'):
    output_0 = model(x)  # all ops in the replica will live on GPU:0
with tf.device('/gpu:1'):
    output_1 = model(x)  # all ops in the replica will live on GPU:1

with tf.device('/cpu:0'):
    preds = 0.5 * (output_0 + output_1)

output_value = sess.run([preds], feed_dict={x: data})  # `data` is the input batch
However, if I print
print output_0
print output_1
the output gives
Tensor("Softmax_28:0", shape=(?, 10), dtype=float32, device=/device:GPU:0) Tensor("Softmax_28:0", shape=(?, 10), dtype=float32, device=/device:GPU:0)
It seems that only the first device scope is active and only one GPU is used.
Obviously, I am missing something. Any help would be appreciated.
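(For reference, one way to see where ops actually end up is to enable device-placement logging when creating the session; this is a minimal sketch assuming a plain TensorFlow 1.x graph/session setup.)

import tensorflow as tf

# Log the device each op is assigned to when the graph runs.
# allow_soft_placement lets TF fall back to another device when an op
# has no kernel for the requested one, which can silently move ops off a GPU.
config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
sess = tf.Session(config=config)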
It's not clear how to actually run the above example on multiple GPUs. Do we call preds.fit instead of the usual model.fit?
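(For what it's worth, one way to run/train such a graph without model.fit is to define the loss and optimizer on preds directly in TensorFlow; the sketch below assumes a hypothetical labels placeholder y and a plain categorical cross-entropy on the averaged softmax outputs, not any particular Keras API.)

import tensorflow as tf

# `x`, `preds` and `sess` come from the snippet above; `y` is a hypothetical labels placeholder.
y = tf.placeholder(tf.float32, shape=(None, 10))

# Cross-entropy on the averaged probabilities, plus a plain TF optimizer.
loss = -tf.reduce_mean(tf.reduce_sum(y * tf.log(preds + 1e-8), axis=1))
train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)

sess.run(tf.global_variables_initializer())
for x_batch, y_batch in batches:  # `batches` stands in for your own data iterator
    sess.run(train_step, feed_dict={x: x_batch, y: y_batch})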
@rollingstone: I am experiencing the same issue with that example. My devices appear to be executing sequentially.
Any pointers on why this is happening?
@fchollet and other Keras fans: Does Keras support data parallelism (with TensorFlow as the backend) right now? I have one machine with 4 GPUs and I would like to use data parallelism to make convergence faster, i.e. so the batch_size can be set larger.
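(For context, hand-rolled data parallelism with the TensorFlow backend looks roughly like the sketch below: one shared model, with each GPU processing a different slice of the batch. TF 1.x-style tf.split/tf.concat signatures are assumed, and the shapes are illustrative.)

import tensorflow as tf
from keras.layers import Dense
from keras.models import Sequential

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 784))
    # Shared weights live on the CPU; both GPU replicas reuse them.
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

# Data parallelism: split the batch and run each slice on a different GPU.
x_0, x_1 = tf.split(x, num_or_size_splits=2, axis=0)
with tf.device('/gpu:0'):
    y_0 = model(x_0)
with tf.device('/gpu:1'):
    y_1 = model(x_1)

with tf.device('/cpu:0'):
    preds = tf.concat([y_0, y_1], axis=0)  # reassemble the full batch of predictions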
ping
@rollingstone I'm experiencing the same problem. Any update?
@kaigang I am also waiting for an update!
Any updates?
@mongoose54 No further updates. @fchollet has already confirmed that fit_distributed won't appear in the next version of Keras. However, the good news is that TensorFlow officially supports Keras as of version 1.2. This video shows how to use tf.keras to train a VQA model, and it claims that distributed training is no longer a concern.
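(As a point of reference, the TensorFlow-bundled Keras API looks essentially the same as standalone Keras; a minimal sketch, assuming a release where it is exposed as tf.keras rather than tf.contrib.keras:)

import tensorflow as tf

# Same Sequential/Dense API as standalone Keras, imported from TensorFlow.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=128, epochs=5)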
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
@fchollet has provided an excellent blog post about using Keras as a simplified interface for TensorFlow. At the end of that post, it shows how to use multiple GPUs to train a model.
What confuses me is why we have to average the outputs on cpu:0. My understanding of using multiple GPUs is that Keras could train the model faster by computing gradients and updating weights across replicas rather than on a single one; in other words, convergence should be faster. Please correct me if I am wrong. And if Keras can handle such automatic multi-GPU training, what is the simplest way to implement it (perhaps in just a few lines of code)?
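(For anyone landing here later: newer Keras releases (2.0.9+) ship keras.utils.multi_gpu_model, which wraps exactly this kind of replica averaging in one call; a minimal sketch, assuming 4 visible GPUs:)

from keras.layers import Dense
from keras.models import Sequential
from keras.utils import multi_gpu_model

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

# Replicates the model on 4 GPUs; each GPU gets a slice of every batch
# and the results are merged on the CPU (data parallelism).
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')
# parallel_model.fit(x_train, y_train, batch_size=512, epochs=5)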