TobyPDE / FRRN

Full Resolution Residual Networks for Semantic Image Segmentation

training batch size #25

Closed. manuel-88 closed this issue 7 years ago.

manuel-88 commented 7 years ago

What batch size did you use for training? I'm using a GeForce GTX 1080 Ti and can fit a maximum batch size of 3. Isn't it better to train convolutional networks with larger batch sizes?

Is it possible to do multi-GPU training, and could I increase the batch size by using multiple GPUs?

So many questions :) I hope you can help me.

TobyPDE commented 7 years ago

I used a batch size of 3 because of the same memory restrictions you are probably running into. If you want to train with larger batches, you need to reduce the image resolution, for example by extracting crops from the images instead of training on full frames. There is an option for this in the Chianti C++ library. However, I have only experimented with it briefly and didn't find it particularly useful.
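Since activation memory in a fully convolutional network scales roughly with the pixel count, a crop of half the height and half the width should allow roughly four times the batch size. To illustrate the cropping idea, here is a plain NumPy sketch of extracting the same window from an image and its label map so the two stay aligned; this is not the Chianti library's actual API, and the function name and Cityscapes-sized shapes are assumptions:

```python
import numpy as np

def random_crop(image, labels, crop_h, crop_w, rng):
    """Cut an aligned random window out of image (H, W, C) and labels (H, W)."""
    h, w = labels.shape
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return (image[top:top + crop_h, left:left + crop_w],
            labels[top:top + crop_h, left:left + crop_w])

rng = np.random.default_rng(0)
image = rng.random((1024, 2048, 3))          # a Cityscapes-sized frame
labels = rng.integers(0, 19, (1024, 2048))   # per-pixel class IDs
img_crop, lbl_crop = random_crop(image, labels, 512, 1024, rng)
```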

I'm not sure about the current state of multi-GPU training with Theano. It used to be poorly supported, so I can't really give advice w.r.t. using multiple GPUs.
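If the goal is only a larger effective batch size on a single GPU, one common workaround, independent of any multi-GPU support, is gradient accumulation: run several small forward/backward passes and apply one parameter update with the averaged gradients. Below is a minimal NumPy sketch of the idea on a toy least-squares model; the model and all names are illustrative and not part of the FRRN code base:

```python
import numpy as np

# Simulate a batch of 12 on hardware that only fits micro-batches of 3 by
# summing gradients over 4 forward/backward passes, then updating once.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))          # toy model parameters
x = rng.normal(size=(12, 4))         # 12 samples we cannot fit at once
y = rng.normal(size=(12, 2))

micro_batch = 3
accumulated = np.zeros_like(W)

for start in range(0, len(x), micro_batch):
    xb = x[start:start + micro_batch]
    yb = y[start:start + micro_batch]
    # Gradient of the mean-squared error w.r.t. W for this micro-batch.
    grad = 2.0 * xb.T @ (xb @ W - yb) / len(xb)
    accumulated += grad

# One update with the averaged gradient, equivalent to a batch of 12.
learning_rate = 0.01
W -= learning_rate * accumulated / (len(x) // micro_batch)
```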

If you are considering digging into the model and doing some research with it, I strongly advise experimenting with FRRN A instead of FRRN B. It lets you train with different batch sizes, and the network converges in a reasonable amount of time (say 1.5 days on a 1080 Ti). Only once you are happy with your FRRN A results would I move on to training on high-resolution images.