Hi! The original configuration was trained on 4 V100 GPUs (32 GB each). On Colab you will need to manually reduce the batch size to half or a quarter of the original. Colab's session limits are also likely too short for a full training run, so it is best to reduce the number of epochs to 1 to observe some initial results.
Could you elaborate on your problem a bit more?
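To keep training behavior close to the original 4-GPU setup while using a smaller per-step batch, one common option is gradient accumulation. The sketch below is purely illustrative: the original batch size of 8 per GPU is an assumption, not a value from this repo's config, so check the actual configuration file before applying it.

```python
# Hypothetical adjustment for a single Colab GPU (ORIG_BATCH_PER_GPU is a
# placeholder; take the real value from the repo's config). The idea: halve
# the per-GPU batch size, then accumulate gradients over enough steps that
# the effective batch size matches the original 4x V100 setup.
ORIG_GPUS = 4
ORIG_BATCH_PER_GPU = 8                              # assumed, not from the repo
effective_batch = ORIG_GPUS * ORIG_BATCH_PER_GPU    # 32 in this example

colab_batch = ORIG_BATCH_PER_GPU // 2               # half of the original
accum_steps = effective_batch // colab_batch        # gradient-accumulation steps

print(colab_batch, accum_steps)                     # -> 4 8
```

With these numbers, optimizer updates happen every 8 accumulated steps, so each update still reflects 32 samples, matching the original effective batch size.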