Hi, thanks for your great work~
I ran the demo successfully, but I noticed that GPU utilization fluctuates rapidly between 0% and 90%, and only one GPU is active at a time... This significantly increases the total training time. (It currently takes approximately 8 hours on 4x V100.)
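To illustrate what I mean by "only one GPU working": here is a minimal check from inside the training process, assuming the demo is PyTorch-based (a sketch, not code from the repo):

```python
import torch
import torch.distributed as dist

# How many GPUs does this process see?
print(f"visible GPUs: {torch.cuda.device_count()}")

# Is a distributed (data-parallel) process group actually running?
print(f"distributed initialized: {dist.is_available() and dist.is_initialized()}")

# Which GPUs hold allocations from this process?
for i in range(torch.cuda.device_count()):
    mib = torch.cuda.memory_allocated(i) / 1024 ** 2
    print(f"cuda:{i}: {mib:.0f} MiB allocated")
```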
I'm new to this, but based on my previous experience, this may be caused by a batch size that is too small. So I tried increasing the batch size from 128 to 256 and the micro batch size from 4 to 64, but that didn't help...
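For reference, here is how I understand the two settings to interact, assuming the common gradient-accumulation scheme (the variable names are mine, not necessarily the demo's):

```python
# Assumed relationship (common gradient-accumulation setup):
#   one optimizer step = batch_size samples,
#   processed micro_batch_size samples at a time.
batch_size = 256       # raised from 128
micro_batch_size = 64  # raised from 4

grad_accum_steps = batch_size // micro_batch_size
print(grad_accum_steps)  # 4 accumulation steps per optimizer step (was 128 // 4 = 32)
```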
Is this normal? If not, what should I try next?