TobyPDE / FRRN

Full Resolution Residual Networks for Semantic Image Segmentation

training batch size #25

Closed. manuel-88 closed this issue 7 years ago.

manuel-88 commented 7 years ago

What batch size did you use for training? I'm using a GeForce GTX 1080 Ti and can fit a maximum batch size of 3. Isn't it better to train convolutional networks with larger batch sizes?

Is it possible to do multi-GPU training, and could I increase the batch size by using multiple GPUs?

So many questions :) I hope you can help me.

TobyPDE commented 7 years ago

I used a batch size of 3 because of the same memory restrictions you are probably running into. If you want to train with larger batches, you need to reduce the image resolution, for example by extracting crops from the images instead of training on full frames. There is an option for this in the Chianti C++ library. However, I have only experimented with it briefly and didn't find it particularly useful.
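Since activation memory in a fully convolutional network scales roughly with the pixel count, a crop of half the height and half the width should allow roughly four times the batch size. To illustrate the cropping idea, here is a plain NumPy sketch of extracting the same window from an image and its label map so the two stay aligned; this is not the Chianti library's actual API, and the function name and Cityscapes-sized shapes are assumptions:

```python
import numpy as np

def random_crop(image, labels, crop_h, crop_w, rng):
    """Cut an aligned random window out of image (H, W, C) and labels (H, W)."""
    h, w = labels.shape
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return (image[top:top + crop_h, left:left + crop_w],
            labels[top:top + crop_h, left:left + crop_w])

rng = np.random.default_rng(0)
image = rng.random((1024, 2048, 3))          # a Cityscapes-sized frame
labels = rng.integers(0, 19, (1024, 2048))   # per-pixel class IDs
img_crop, lbl_crop = random_crop(image, labels, 512, 1024, rng)
```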

I'm not sure about the current state of multi-GPU training with Theano. It used to be poorly supported, so I can't really give advice w.r.t. using multiple GPUs.
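If the goal is only a larger effective batch size on a single GPU, one common workaround, independent of any multi-GPU support, is gradient accumulation: run several small forward/backward passes and apply one parameter update with the averaged gradients. Below is a minimal NumPy sketch of the idea on a toy least-squares model; the model and all names are illustrative and not part of the FRRN code base:

```python
import numpy as np

# Simulate a batch of 12 on hardware that only fits micro-batches of 3 by
# summing gradients over 4 forward/backward passes, then updating once.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))          # toy model parameters
x = rng.normal(size=(12, 4))         # 12 samples we cannot fit at once
y = rng.normal(size=(12, 2))

micro_batch = 3
accumulated = np.zeros_like(W)

for start in range(0, len(x), micro_batch):
    xb = x[start:start + micro_batch]
    yb = y[start:start + micro_batch]
    # Gradient of the mean-squared error w.r.t. W for this micro-batch.
    grad = 2.0 * xb.T @ (xb @ W - yb) / len(xb)
    accumulated += grad

# One update with the averaged gradient, equivalent to a batch of 12.
learning_rate = 0.01
W -= learning_rate * accumulated / (len(x) // micro_batch)
```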

If you are considering digging into the model and doing some research with it, I strongly advise experimenting with FRRN A instead of FRRN B. It lets you train with different batch sizes, and the network converges in a reasonable amount of time (say 1.5 days on a 1080 Ti). Only once you are happy with your FRRN A results would I move on to training on high-resolution images.