Closed by andrew-begain 4 years ago
Sorry for the late reply. The number of GPUs does not matter; what matters is the total GPU memory. If you do not have enough GPU memory, you can train with a batch size as small as 1, although performance is not guaranteed at that setting. In some previous work, the authors trained a network with batch size 1 by freezing the batch norm layers and fine-tuning with a small learning rate. You may try that.
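As a rough sketch of that workaround (not from this repo — a generic PyTorch illustration, assuming the model uses standard `nn.BatchNorm2d` layers): put every batch norm layer in eval mode so its running statistics stay fixed, freeze its affine parameters, and train the rest of the network with a small learning rate.

```python
import torch
import torch.nn as nn

def freeze_batchnorm(model: nn.Module) -> nn.Module:
    """Fix all BatchNorm layers: use stored running statistics and
    stop gradient updates, so batch size 1 does not corrupt them."""
    for module in model.modules():
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            module.eval()                    # keep running mean/var frozen
            for p in module.parameters():
                p.requires_grad = False      # freeze affine weight/bias
    return model

# Toy model standing in for the real network (hypothetical, for illustration).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
model.train()                # enable training mode for the rest of the net
freeze_batchnorm(model)      # then re-freeze the BN layers

# Optimize only the unfrozen parameters, with a small learning rate.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

x = torch.randn(1, 3, 16, 16)  # batch size 1
loss = model(x).mean()
loss.backward()
optimizer.step()
```

Note that if your training loop calls `model.train()` at the start of every epoch, the BN layers are switched back to training mode, so `freeze_batchnorm` must be re-applied after each such call.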
@MendelXu Can this only be trained on multiple GPUs, or can it be trained on a single GPU? Thanks.