tensorflow / models

Models and examples built with TensorFlow

Why are multiple GPUs slower than a single GPU? #5238

Closed by YOUYOUYOU 6 years ago

YOUYOUYOU commented 6 years ago

Please go to Stack Overflow for help and support:

http://stackoverflow.com/questions/tagged/tensorflow

Also, please understand that many of the models included in this repository are experimental and research-style code. If you open a GitHub issue, here is our policy:

  1. It must be a bug, a feature request, or a significant problem with documentation (for small docs fixes please send a PR instead).
  2. The form below must be filled out.

Here's why we have that policy: TensorFlow developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.


System information

You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

You can obtain the TensorFlow version with

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the problem

Describe the problem clearly here. Be sure to convey here why it's a bug in TensorFlow or a feature request.

Source code / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.

I trained textsum on 4 GPUs and on 1 GPU. The 4-GPU run is slower than the single-GPU run, and both show low GPU utilization. Can anyone help me resolve this problem?

k-w-w commented 6 years ago

Make sure that you are comparing the number of batches trained per step, not just the time per step. With 4 GPUs the model could be training 4x the number of batches each step, so each step could take longer even though more examples are trained.
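To make that comparison concrete, here is a minimal sketch that converts step time into examples per second. It assumes each step trains one batch on every GPU; the function name `examples_per_second` and all of the numbers are hypothetical and are not measurements from textsum.

```python
def examples_per_second(batch_size_per_gpu, num_gpus, seconds_per_step):
    """Throughput, assuming each step trains one batch on every GPU."""
    return batch_size_per_gpu * num_gpus / seconds_per_step


# Hypothetical numbers for illustration only; substitute your own measurements.
single = examples_per_second(batch_size_per_gpu=64, num_gpus=1, seconds_per_step=0.5)
multi = examples_per_second(batch_size_per_gpu=64, num_gpus=4, seconds_per_step=1.0)

print(f"1 GPU : {single:.0f} examples/sec")
print(f"4 GPUs: {multi:.0f} examples/sec")
print(f"speedup: {multi / single:.2f}x")
```

With these made-up numbers, the 4-GPU step takes twice as long but still processes twice as many examples per second. If the speedup computed from your own measurements comes out below 1x, the multi-GPU run really is slower.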

This question is better asked on Stack Overflow since it is not a bug or feature request. There is also a larger community that reads questions there.

If you think we've misinterpreted a bug, please comment again with a clear explanation, as well as all of the information requested in the issue template.