BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Explicitly disable backward propagation of a layer for controlled fine tuning #389

Closed kloudkl closed 10 years ago

kloudkl commented 10 years ago

The Google video classification CNN explored four transfer learning methods: training from scratch, fine-tuning only the top layer (the classifier), fine-tuning the top 3 layers, and fine-tuning all layers [1]. Fine-tuning only specific layers keeps the generic features of the other layers untouched during training. They found that fine-tuning the top 3 layers performed best.

It is not very straightforward to reason about whether the backward propagation of a layer is disabled in Caffe, as shown in #100 and #103. So it would be nice to be able to explicitly disable it.

[1] Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei. Large-Scale Video Classification with Convolutional Neural Networks. CVPR 2014.

shelhamer commented 10 years ago

@jeffdonahue has an improved backward interface in the works. Jeff, how about adding an optional repeated field for the back propagation flags? Does that fit neatly into your new init logic that determines the vector of propagation flags?


jeffdonahue commented 10 years ago

I believe that you can already do this in Caffe by setting blobs_lr: 0.0 in all layers you won't fine-tune (you need two of those lines if the layer has biases), and then their backward passes won't be computed, unless you have layers below them with non-zero blobs_lr. I could add another bool parameter to LayerParameter, called something like force_no_backward, but I'm not sure how to handle the case of a force_no_backward layer having weights (with blobs_lr > 0) below it.
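For concreteness, a frozen convolution layer in a prototxt of that era (the V1-style layer definition) might look roughly like this; the layer name, blob names, and convolution parameters are placeholders:

```protobuf
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  # The first blobs_lr applies to the filter weights, the second to the
  # biases; setting both to 0 freezes this layer during fine-tuning.
  blobs_lr: 0.0
  blobs_lr: 0.0
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
```

If every layer below a frozen layer is also frozen, Caffe can skip the backward pass for that whole stack entirely.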

shelhamer commented 10 years ago

Right. What I'm suggesting is a field not for weight blobs but for bottoms, acting as a vector of flags, one per bottom, to dictate whether backpropagation should continue to that bottom.
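As a sketch of what such a per-bottom flag could look like in the prototxt (the field name propagate_down is hypothetical here, not a parameter that exists at the time of this discussion), with one repeated entry per bottom:

```protobuf
layers {
  name: "fc6"
  type: INNER_PRODUCT
  bottom: "pool5"
  top: "fc6"
  # Hypothetical repeated field, one flag per bottom: false would stop
  # backpropagation from continuing down into "pool5".
  propagate_down: false
}
```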

If it overcomplicates the logic we can leave it as an issue for now.

shelhamer commented 10 years ago

Closing since this is already supported by blobs_lr.

HoldenCaulfieldRye commented 10 years ago

If blobs_lr is set to 0, does that actually prevent the partial derivatives from being computed? If the GPU is computing them but then updating the weights by 0, it seems like a very hacky and expensive way to go about it...

shelhamer commented 10 years ago

It does prevent all the unnecessary computation. It's not a hack at all. This is just how we signify in our model definitions that further backpropagation is unnecessary. If you inspect the output during model construction you will see Caffe decide where to backpropagate and where not.

See Net::Init() for the details: https://github.com/BVLC/caffe/blob/master/src/caffe/net.cpp#L32-L171
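The decision Net::Init() makes can be sketched roughly like this (a simplified Python model of the logic, not the actual C++ code): walking the layers bottom-to-top, a layer runs Backward() only if one of its own parameter blobs has a non-zero learning rate, or if one of its bottom blobs already carries a gradient because a learning layer sits below it.

```python
def compute_need_backward(layers):
    """layers: list of dicts in bottom-to-top order, each with keys
    'name', 'bottoms', 'tops', and 'param_lrs' (the blobs_lr values).
    Returns the set of layer names whose Backward() must run."""
    blob_needs_grad = set()  # blobs that must carry a gradient
    need_backward = set()
    for layer in layers:
        # A layer needs backward if it has learnable params (lr != 0)
        # or any of its bottom blobs needs a gradient.
        has_learnable = any(lr != 0 for lr in layer["param_lrs"])
        bottom_needs = any(b in blob_needs_grad for b in layer["bottoms"])
        if has_learnable or bottom_needs:
            need_backward.add(layer["name"])
            blob_needs_grad.update(layer["tops"])
    return need_backward

# Example: conv1 frozen with blobs_lr: 0 for weights and biases, fc learning.
# conv1 has nothing learnable below it, so its backward pass is skipped.
net = [
    {"name": "conv1", "bottoms": ["data"],  "tops": ["conv1"], "param_lrs": [0, 0]},
    {"name": "fc",    "bottoms": ["conv1"], "tops": ["fc"],    "param_lrs": [1, 2]},
    {"name": "loss",  "bottoms": ["fc"],    "tops": ["loss"],  "param_lrs": []},
]
print(compute_need_backward(net))  # 'fc' and 'loss' only; conv1 is skipped
```

This is why setting blobs_lr: 0.0 on the bottom layers is not wasteful: the frozen layers never enter the need-backward set, so their gradients are simply never computed.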


HoldenCaulfieldRye commented 10 years ago

ah ok, sorry guys. nice job on keeping the UI simple then!