fliptrail opened this issue 4 years ago
I am encountering this exact issue on TensorFlow 2.0.0: https://github.com/tensorflow/tensorflow/issues/30321. A possible solution is given there.
Yes, the possible solution is in the link mentioned above. Read more about "model parallelism vs. data parallelism".
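For reference, here is a minimal data-parallel training sketch for TF 2.x using `tf.distribute.MirroredStrategy`, which is the usual replacement for `multi_gpu_model` in TF 2.0. The model, layer sizes, and synthetic data below are placeholders, not the actual model from this repo:

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy replicates the model on every visible GPU and splits each
# batch across them (data parallelism).
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # The model must be built and compiled inside the strategy scope.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(64,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# Synthetic data; the global batch size is divided across the replicas.
x = np.random.rand(1024, 64).astype("float32")
y = np.random.randint(0, 10, size=(1024,))
model.fit(x, y, batch_size=64 * strategy.num_replicas_in_sync, epochs=2)
```

Note that data parallelism only helps throughput; if the model itself does not fit on one GPU, model parallelism is a different problem.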
Hello, as the title suggests, I am unable to train this model on a multi-GPU configuration. I am trying to train it on 4 RTX 2080 Ti cards. It loads the model only onto the 1st GPU, using about 10.5 GB of its 11 GB of memory; each of the remaining GPUs uses only about 155 MB of 11 GB. Also, the training speed is independent of the number of GPUs I select with `CUDA_VISIBLE_DEVICES`, so apparently only the 1st GPU is being used. I tried diving into the code to find the relevant function, `multi_gpu_model`, but everything seemed fine to me. Can you confirm this, or tell me how to train this implementation over multiple GPUs?