I tried to add multi-GPU training with the `accelerate` package to this repository, but I have issues reproducing the same results. Here are the things I have already tried:
I reduced the per-GPU batch size according to the number of GPUs, so the effective global batch size matches the single-GPU run.
I also went the other way and increased the batch size in the old code, and then compared multi-GPU vs. single-GPU results.
I replaced the Batch Normalization layers with Synchronized Batch Normalization layers (a rough sketch of both of these changes is below).
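Here is a minimal sketch of how I wired the two changes together. The model, dataset, and the batch size of 32 are just toy stand-ins to illustrate the setup, not the actual code from this repository:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# toy model with a BatchNorm layer; converted to SyncBatchNorm so the
# batch statistics are computed across all GPUs instead of per device
model = nn.Sequential(nn.Linear(16, 32), nn.BatchNorm1d(32), nn.ReLU(), nn.Linear(32, 1))
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# assume the single-GPU run used a batch size of 32 (placeholder value);
# dividing by the number of processes keeps the same global batch size
per_device_batch_size = 32 // accelerator.num_processes

dataset = TensorDataset(torch.randn(512, 16), torch.randn(512, 1))
loader = DataLoader(dataset, batch_size=per_device_batch_size, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

criterion = nn.MSELoss()
model.train()
for inputs, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    accelerator.backward(loss)  # accelerate handles the gradient synchronization
    optimizer.step()
```

I launch it in the usual way with `accelerate config` followed by `accelerate launch`.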
Do you have any idea what could be causing this and what is required to make multi-GPU training work?
PS: I gave you a star for the good support on the last licence question ;-)