Memory optimization, referring to the fork (https://github.com/yjxiong/caffe/tree/mem)


Issue summary

Due to a GPU memory shortage, I cannot run SENet (the recent ILSVRC 2017 winning model; https://github.com/hujie-frank/SENet) on my GTX 1080 Ti.

I found that hujie-frank recommends using a memory-optimized branch of Caffe (https://github.com/hujie-frank/SENet/issues/6), specifically yjxiong's branch (https://github.com/yjxiong/caffe/tree/mem).

This branch uses the "MemoryOptimize_v2" function in net.cpp. The optimization reduced GPU memory usage enough that I could increase the training batch size by 20-30% for VGG-19 and ResNet-152.
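To illustrate the general idea of saving memory by letting blobs share storage, here is a minimal sketch of lifetime-based buffer reuse. It is not yjxiong's "MemoryOptimize_v2" implementation; the function name `assign_shared_buffers`, the blob names, sizes, and lifetimes are all hypothetical and chosen only for illustration.

```python
def assign_shared_buffers(blobs):
    """Greedily map blobs onto shared buffer slots.

    blobs: list of (name, size_bytes, first_use, last_use), where first_use
    and last_use are layer indices (hypothetical lifetimes, not measured).
    Returns (assignment, total_shared_bytes, total_naive_bytes).
    """
    slots = []       # each slot is [capacity_bytes, last_use_of_current_tenant]
    assignment = {}  # blob name -> slot index
    for name, size, first, last in sorted(blobs, key=lambda b: b[2]):
        for idx, slot in enumerate(slots):
            # Reuse a slot only if its current tenant is no longer needed
            # before this blob is first written.
            if slot[1] < first:
                slot[0] = max(slot[0], size)  # slot must fit its largest tenant
                slot[1] = last
                assignment[name] = idx
                break
        else:
            # No free slot: allocate a new one.
            slots.append([size, last])
            assignment[name] = len(slots) - 1
    total_shared = sum(capacity for capacity, _ in slots)
    total_naive = sum(size for _, size, _, _ in blobs)
    return assignment, total_shared, total_naive


# Hypothetical forward-only chain: each blob is produced by one layer
# and consumed by the next.
example_blobs = [
    ("data", 40, 0, 1),
    ("conv1", 100, 1, 2),
    ("pool1", 50, 2, 3),
    ("fc", 10, 3, 4),
]
assignment, shared_bytes, naive_bytes = assign_shared_buffers(example_blobs)
print(assignment, shared_bytes, naive_bytes)
```

In this toy chain, "pool1" can reuse the storage of "data" and "fc" can reuse the storage of "conv1", so two slots suffice instead of four. During training, blobs that are needed again in the backward pass have longer lifetimes, so the savings would likely be smaller than in this forward-only example.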

yjxiong has written up the details of the memory optimization here: http://blog.yjxiong.me/archives/403 and https://github.com/yjxiong/caffe/wiki/Memory-Optimization

I found that these optimizations have not been applied to BVLC's main branch, so I patched them into a recent version of BVLC Caffe (from 15 to 30 days ago).

Patch: BVLCcaffe_optimized.zip

I think adding the "MemoryOptimize_v2" function to the BVLC branch would help users with limited GPU memory, provided it is not a problem to import yjxiong's source code without permission.

Your system configuration

Operating system: Ubuntu 16.04
Compiler:
CUDA version (if applicable): 8.0
CUDNN version (if applicable): 7.0
BLAS: OpenBLAS
Python or MATLAB version (for pycaffe and matcaffe respectively): 2.7

BVLC / caffe

Memory optimization, referring to the fork (https://github.com/yjxiong/caffe/tree/mem) #6059
