Open AIROBOTAI opened 7 years ago
Hi @AIROBOTAI
It supports multi-GPU training. The idea is to release the upper layers' memory during backpropagation and reallocate it before the next forward propagation. In addition, the batch-norm layers need too much memory for data transformation, and they can be optimized to use only half of the original memory.
Please search for MEMOPT in the code; it will give you some clues about how to use it.
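To illustrate the idea (this is only a hypothetical sketch in Python, not the actual Caffe MEMOPT code): each layer's activation buffer is released once backpropagation has consumed it, and reallocated lazily just before the next forward pass, so peak memory stays low across iterations. The `Layer`, `forward`, and `backward` names here are invented for the illustration.

```python
import numpy as np

class Layer:
    """Toy layer holding an activation buffer that can be freed and reallocated."""

    def __init__(self, size):
        self.size = size
        self.activations = None  # allocated lazily before forward

    def allocate(self):
        # Reallocate the buffer before forward propagation.
        if self.activations is None:
            self.activations = np.zeros(self.size)

    def release(self):
        # Return the buffer to the allocator during backpropagation.
        self.activations = None

def forward(layers):
    for layer in layers:
        layer.allocate()             # reallocate before forward propagation
        layer.activations[:] = 1.0   # stand-in for the real computation

def backward(layers):
    # Walk from the top layer down; once an upper layer's buffer has been
    # consumed by the gradient computation, release it so the memory can
    # be reused by the layers below.
    for layer in reversed(layers):
        _ = layer.activations.sum()  # stand-in for gradient computation
        layer.release()

layers = [Layer(1000) for _ in range(5)]
forward(layers)
backward(layers)
# After one iteration, all activation buffers have been released.
assert all(layer.activations is None for layer in layers)
```

The same cycle repeats every iteration: `forward` reallocates what `backward` freed, which is the reset/reallocate pattern described above.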
Hi Chris,
I came across your code for optimizing Caffe's memory usage. Does it support training on multiple GPUs? Could you show me how to use it? Thanks!