wahrheit-git opened 6 years ago
Same question. Any suggestions?
@wxjeacen What I have understood so far is that the main reason for excess GPU usage, or out-of-memory errors when running on a single GPU with batch size 64, is that this network has a variable input size; when the input size grows, the network requires more GPU memory.
You can test this by restricting the input scale to 320 x 320 and running with batch size 64 (VGG's input size is 224 x 224).
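To see why restricting the input scale helps, here is a back-of-the-envelope sketch of how activation memory scales with input resolution. The channel count and batch size below are illustrative assumptions, not the actual Darknet-19 configuration:

```python
# Rough activation memory for one conv feature map across a batch,
# assuming float32 (4 bytes). Channel count is a hypothetical example.
def feature_map_bytes(height, width, channels=32, batch=64, dtype_bytes=4):
    """Bytes needed to hold one layer's output activations for a whole batch."""
    return batch * channels * height * width * dtype_bytes

# Activation memory grows with the square of the input side length:
# a 608x608 input needs (608/320)^2 = 3.61x the memory of a 320x320 input.
small = feature_map_bytes(320, 320)
large = feature_map_bytes(608, 608)
ratio = large / small  # 3.61
```

So even though the weights are fixed, the memory needed per training step depends heavily on the sampled input scale, which is why the largest scales trigger OOM at batch size 64.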
My question is quite naive. I am using this code to train and observing the GPU memory usage of the train script: it uses around 10-11 GB of GPU memory with just batch size = 16, which seems strange to me, as this network is based on the 19-layer Darknet and is also fully convolutional (so it should not have that many parameters). Even VGG16 can be trained with batch size = 64 for image classification on a 12 GB GPU. Can someone please help me understand what might be going wrong here?
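One likely explanation is that parameter count is not what dominates training memory; the activations saved for the backward pass are. A minimal sketch of the arithmetic, using an illustrative 3x3 conv layer (the specific channel counts and spatial size below are assumptions for the example, not the exact Darknet-19 layout):

```python
# Compare weight storage vs. activation storage for one conv layer (float32).
def conv_param_bytes(k, in_ch, out_ch, dtype_bytes=4):
    """Weight storage for a k x k convolution (bias ignored)."""
    return k * k * in_ch * out_ch * dtype_bytes

def activation_bytes(h, w, ch, batch, dtype_bytes=4):
    """Storage for the layer's output feature map across a batch."""
    return batch * ch * h * w * dtype_bytes

# A 3x3, 512 -> 512 conv holds about 9.4 MB of weights...
params = conv_param_bytes(3, 512, 512)      # 9,437,184 bytes
# ...but its output activations at a 38x38 spatial size, batch 16,
# already take about 47 MB, and every such tensor is kept in GPU
# memory until the backward pass has used it.
acts = activation_bytes(38, 38, 512, 16)    # 47,316,992 bytes
```

Summed over all layers (plus gradients and optimizer state), this is why a fully convolutional network with relatively few parameters can still exhaust a 12 GB GPU at batch size 16, especially at the larger input scales.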