GeorgeBohw closed this issue 6 years ago
Also, which tag of upstream Caffe is this version based on, so I can compare the code against it?
I have found the core code now!
But when I run my own Caffe network, it tells me it can reduce memory consumption to 1.5 GB. In practice, the memory used grows to 11 GB and I get an "out of memory" error. So I wonder whether this optimization only works for certain special networks?
My network contains residual modules... can that case be handled?
I also found that only the split, gather, reshape, and flatten layers are optimized. How do I optimize a layer like convolution? How can I reuse previously allocated memory?
There is no magic in engineering.
Theoretically, in training you can save at most 50% of the memory footprint without sacrificing speed or accuracy. The optimization we provide is a general technique that applies to any network, regardless of which operations you use. And it is not in the split and gather layers: the memory optimization routine is in net.cpp.
@yjxiong Thanks a lot for your response. I have a network that runs one module twice, so I am considering using this optimization technique on it. For example, "ConvNdBackward17_data" and "ConvNdBackward190_data" correspond to the first and second run of that module; I think they could share the same memory.
I am trying to do that now, but it fails with some errors, so I wonder whether this technique supports it. From your code, it seems the optimization only shares memory within the same layer. Can it also share memory across layers (e.g. between ConvNdBackward17_data and ConvNdBackward190_data)?
Thanks in advance!
For example, optical flow computation is part of my network: I need to compute flow(image1, image2) and flow(image2, image1), and ConvNdBackward17_data and ConvNdBackward190_data are the corresponding convolution layers.
The optimization runs in a global manner. To allow the framework to reuse memory in the forward pass, you have to be in testing mode and enable `optimize_test` in the network settings.
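As a hypothetical illustration (the field name follows the comment above, but check this repository's caffe.proto for the exact spelling and placement), the net-level settings would look something like:

```
# Net prototxt fragment -- illustrative only; verify the field name
# against caffe.proto in this fork.
state { phase: TEST }   # memory reuse in forwarding requires testing mode
optimize_test: true     # enable the global memory optimization
```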
Basically, if you need backpropagation for one layer, its memory cannot be reused by other layers.
Thanks!
I want to know where the core code is. Can anybody tell me, especially about the free and malloc calls added by the author of this optimization? Thank you!