keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Add DropBlock regularization layer. #137

Closed sebastian-sz closed 2 years ago

sebastian-sz commented 2 years ago

DropBlock is a regularization technique that is more suitable for CNNs than regular dropout. Perhaps it would be beneficial to have it available in keras-cv?

Paper

Example TF Implementation

LukeWood commented 2 years ago

Hey @sebastian-sz - what would this API look like for users? Some of the implementation details of this preprocessing technique are a bit nuanced and not obvious to me.

Does it rely on passing bounding boxes or segmentation maps to perform augmentation? Does it rely on activations of specific layers of your CNN?

Please provide these details and comment back. Thanks!

sebastian-sz commented 2 years ago

@LukeWood

what would this API look like for users?

I thought it could be used similarly to tf.keras.layers.Dropout. Example (rewritten from here):

x = tf.keras.layers.Conv2D(...)(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.ReLU()(x)
x = DropBlock(keep_probability=0.9, block_size=7)(x)

Users would only have to worry about passing the block_size and keep_probability parameters.
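For reference, here is a rough sketch of what the layer itself could look like, loosely following the paper's Algorithm 1. It is only a simplified illustration (it skips details such as restricting seed positions to valid block centers), and the names are just the ones proposed above, not a final API:

import tensorflow as tf

class DropBlock(tf.keras.layers.Layer):
    """Simplified 2D DropBlock sketch (NHWC inputs)."""

    def __init__(self, keep_probability=0.9, block_size=7, **kwargs):
        super().__init__(**kwargs)
        # Stored as a variable so a schedule could update it during training.
        self.keep_probability = tf.Variable(
            keep_probability, trainable=False, dtype=tf.float32)
        self.block_size = block_size

    def call(self, inputs, training=None):
        # Simplification: assumes `training` is a Python bool.
        if not training:
            return inputs

        shape = tf.shape(inputs)
        height = tf.cast(shape[1], tf.float32)
        width = tf.cast(shape[2], tf.float32)
        size = tf.cast(self.block_size, tf.float32)

        # Seed sampling rate chosen so the expected fraction of dropped
        # units matches (1 - keep_probability), as in the paper.
        gamma = ((1.0 - self.keep_probability) / (size ** 2)
                 * (height * width)
                 / ((height - size + 1.0) * (width - size + 1.0)))

        # Sample block centers, then grow each center into a
        # block_size x block_size block of dropped units via max pooling.
        seeds = tf.cast(
            tf.random.uniform(shape, dtype=tf.float32) < gamma, tf.float32)
        drop_mask = tf.nn.max_pool2d(
            seeds, ksize=self.block_size, strides=1, padding="SAME")
        keep_mask = 1.0 - drop_mask

        # Rescale so the expected activation magnitude stays the same.
        return inputs * keep_mask / (tf.reduce_mean(keep_mask) + 1e-7)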

Does it rely on passing bounding boxes or segmentation maps to perform augmentation?

No, as far as I know it can be treated as a "bonus" block, as it only performs regularization. For example, the paper mentions adding it to the backbone of RetinaNet to boost the final mAP.

Does it rely on activations of specific layers of your CNN?

The paper suggests using DropBlock for groups 3 and 4 in ResNet. The reference implementation applies DropBlock after the ReLU in a Conv -> BatchNorm -> ReLU block.

Limitations:

Sadly, this can get complicated:

  1. The paper mentions "Scheduled DropBlock", where instead of keeping keep_probability constant, the value should start at 1 and slowly decrease to the target value with each train step. The reference implementation is here (a rough sketch of such a schedule is included after this list). I don't think this is easy in the proposed implementation, as the layer would somehow need access to the total number of steps and the current step.

  2. The reference implementation also modifies the keep_probability value depending on which group the block is applied to. As I understand it, these values would also change over time as described in 1).

This complicates the proposed solution a lot, but I'm not sure how crucial these details are for the layer to provide better results.
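For what it's worth, a minimal sketch of what the schedule in 1) could boil down to, assuming the user supplies both step counts (the helper name is hypothetical):

import tensorflow as tf

def scheduled_keep_probability(current_step, total_steps,
                               target_keep_probability=0.9):
    # Linearly decay keep_probability from 1.0 to the target value.
    progress = tf.minimum(
        tf.cast(current_step, tf.float32) / float(total_steps), 1.0)
    return 1.0 - (1.0 - target_keep_probability) * progress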

Let me know what you think.

bhack commented 2 years ago

It would be nice if we could also support Google's AutoDropout, which generally performed better than the "fixed" DropBlock:

https://arxiv.org/abs/2101.01761

https://github.com/google-research/google-research/issues/727

Also, as you can see in the paper above, I don't think this is strictly preprocessing/CV specific:

The learned dropout patterns also transfers to different tasks and datasets, such as from language model on Penn Treebank to English-French translation on WMT 2014

What will we do if we want to reuse this in keras-nlp?

bhack commented 2 years ago

See also my previous comment at https://github.com/keras-team/keras-cv/pull/30#issuecomment-1008685239

LukeWood commented 2 years ago

This seems like a useful layer to me. As for what to do about reuse in KerasNLP: transparently, we don't know yet.

sebastian-sz commented 2 years ago

@LukeWood I think the implementation can wait for the preprocessing layer API refactor?

The layer mentioned by @bhack is also interesting. Should AutoDropout be implemented instead of DropBlock, or should both layers coexist?

LukeWood commented 2 years ago

Yes, let's wait for the preprocessing layer refactor on this one. It should be available soon.

LukeWood commented 2 years ago

Should AutoDropout be implemented instead of DropBlock, or should both layers coexist?

I will need to read about the differences in detail before answering that one. I'm not familiar enough with the techniques yet.

bhack commented 2 years ago

As I've mentioned, the GitHub link to the reference implementation in the Google paper is broken.

Both are landing in PyTorch's torchvision:

https://github.com/pytorch/vision/pull/5416

sebastian-sz commented 2 years ago

Interesting, I didn't know about the torchvision PR. Their proposed API looks similar to what I described above. I think it would be beneficial to also have DropBlock implemented here.

LukeWood commented 2 years ago

Thanks for providing so many references. This looks like a great contribution. I've added the contributions welcome label.

LukeWood commented 2 years ago

@sebastian-sz have you looked at the increasing DropBlock schedule that the paper recommends?

bhack commented 2 years ago

Just a reminder in case we want to extend this to the 3D case: https://github.com/pytorch/vision/pull/5416/files

sebastian-sz commented 2 years ago

@LukeWood Yes, the example is here. It seems one would need access to total_steps and current_step. I'm not sure if there is an easy way to access those without having the user explicitly pass total_steps.
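One workaround could be having the user pass total_steps explicitly to a small callback that updates the layer's keep_probability each batch. A rough sketch, assuming a DropBlock layer that stores keep_probability in a non-trainable tf.Variable (like the sketch earlier in this thread); the class name is hypothetical:

import tensorflow as tf

class DropBlockSchedule(tf.keras.callbacks.Callback):
    """Sketch: linearly decay keep_probability of all DropBlock layers."""

    def __init__(self, total_steps, target_keep_probability=0.9):
        super().__init__()
        self.total_steps = total_steps
        self.target_keep_probability = target_keep_probability
        self._step = 0

    def on_train_batch_begin(self, batch, logs=None):
        self._step += 1
        progress = min(self._step / self.total_steps, 1.0)
        # Same linear schedule as sketched earlier in the thread.
        keep_prob = 1.0 - (1.0 - self.target_keep_probability) * progress
        # Note: only walks top-level layers; nested models would need
        # recursion (e.g. over model.submodules).
        for layer in self.model.layers:
            if isinstance(layer, DropBlock):
                layer.keep_probability.assign(keep_prob)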

@bhack Yes, the 3D case is similar. It could be added in this PR or with a separate issue + PR. I can add it here; what are your opinions?
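For illustration, compared to the 2D sketch earlier, roughly only the seed rate and the pooling op would change for a 3D (NDHWC) variant. A rough, hypothetical helper:

import tensorflow as tf

def drop_block_3d_keep_mask(inputs, keep_probability, block_size):
    # Sketch of the keep-mask construction for a 3D DropBlock variant.
    shape = tf.shape(inputs)
    depth = tf.cast(shape[1], tf.float32)
    height = tf.cast(shape[2], tf.float32)
    width = tf.cast(shape[3], tf.float32)
    size = tf.cast(block_size, tf.float32)

    # Same idea as in 2D, but the seed rate accounts for a cubic block.
    gamma = ((1.0 - keep_probability) / (size ** 3)
             * (depth * height * width)
             / ((depth - size + 1.0) * (height - size + 1.0)
                * (width - size + 1.0)))

    seeds = tf.cast(tf.random.uniform(shape) < gamma, tf.float32)
    # max_pool3d grows each seed into a block_size**3 block of dropped units.
    return 1.0 - tf.nn.max_pool3d(
        seeds, ksize=block_size, strides=1, padding="SAME")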

bhack commented 2 years ago

@bhack Yes, the 3D case is similar. It could be added in this PR or with a separate issue + PR. I can add it here; what are your opinions?

As you like.

sebastian-sz commented 2 years ago

@bhack I'd prefer the 3D variant to be added in a separate PR (perhaps with a separate issue). There is still some work to do on the 2D variant.