Closed sebastian-sz closed 2 years ago
Hey @sebastian-sz - what would this API look like for users? Some of the implementation details of this preprocessing technique are a bit nuanced to me.
Does it rely on passing bounding boxes or segmentation maps to perform augmentation? Does it rely on activations of specific layers of your CNN?
Please provide these details and comment back. Thanks!
@LukeWood
what would this API look like for users?
I thought it could be used similarly to tf.keras.layers.Dropout. Example (rewrite from here):
x = tf.keras.layers.Conv2D(...)(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.ReLU()(x)
x = DropBlock(keep_probability=0.9, block_size=7)(x)
Users would only have to worry about passing the block_size and keep_probability parameters.
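For context, here is a minimal sketch of how such a layer might look. The class name and defaults mirror the example above, but the body is my own simplification (it samples block seeds over the whole feature map rather than only the valid region, unlike the paper), not an existing keras-cv API:

```python
import tensorflow as tf


class DropBlock(tf.keras.layers.Layer):
    """Minimal DropBlock sketch. Like Dropout, it is a no-op at inference."""

    def __init__(self, keep_probability=0.9, block_size=7, **kwargs):
        super().__init__(**kwargs)
        self.keep_probability = keep_probability
        self.block_size = block_size

    def call(self, inputs, training=None):
        if not training:
            return inputs
        h, w = inputs.shape[1], inputs.shape[2]
        # gamma: per-position probability of seeding a dropped block,
        # following Eq. 1 of the DropBlock paper
        gamma = ((1.0 - self.keep_probability) * h * w
                 / self.block_size ** 2
                 / ((h - self.block_size + 1) * (w - self.block_size + 1)))
        # sample seeds, then grow each seed into a block_size x block_size
        # square with a stride-1 max pool
        seeds = tf.cast(
            tf.random.uniform(tf.shape(inputs)) < gamma, inputs.dtype)
        drop_mask = tf.nn.max_pool2d(
            seeds, ksize=self.block_size, strides=1, padding="SAME")
        keep_mask = 1.0 - drop_mask
        # rescale surviving activations to preserve the expected magnitude
        count = tf.cast(tf.size(keep_mask), inputs.dtype)
        return inputs * keep_mask * count / tf.maximum(
            tf.reduce_sum(keep_mask), 1.0)
```

With this, the usage above (`x = DropBlock(keep_probability=0.9, block_size=7)(x)`) works as-is, and the layer passes inputs through unchanged when not training.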
Does it rely on passing bounding boxes or segmentation maps to perform augmentation?
No, as far as I know it can be treated as a "bonus" block, as it only performs regularization. For example, the paper mentions adding it to the backbone of RetinaNet to boost the final mAP.
Does it rely on activations of specific layers of your CNN?
The paper suggests using DropBlock for groups 3 and 4 in ResNet. The reference implementation uses DropBlock after the ReLU in the Conv->BatchNorm->ReLU block.
Sadly, this can get complicated:
1. The paper mentions "Scheduled DropBlock", where instead of keeping keep_probability constant, the value should start at 1 and slowly decrease (to the target value) with each training step. The reference implementation is here. I don't think this is easy in the proposed implementation, as it would somehow need access to the total number of steps and the current step.
2. The implementation also modifies keep_probability values depending on which group the block is applied to. As I understand it, this would also change with regard to 1).
This complicates the proposed solution a lot, but I'm not sure how crucial these are for the layer to provide better results.
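The linear schedule itself is easy to express on its own; a sketch (the function name and defaults are illustrative, not taken from the paper or any implementation):

```python
def scheduled_keep_probability(step, total_steps, target=0.9, start=1.0):
    """Linearly anneal keep_probability from `start` (drop nothing)
    down to `target` over `total_steps` training steps."""
    progress = min(step / total_steps, 1.0)
    return start - progress * (start - target)
```

For example, `scheduled_keep_probability(500, 1000)` gives 0.95 with the defaults. The hard part is the plumbing: feeding `step` and `total_steps` to the layer during training, not the formula.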
Let me know what you think.
It could be nice if we could support Google's AutoDropout, which generally performed better than the "fixed" DropBlock:
https://arxiv.org/abs/2101.01761
https://github.com/google-research/google-research/issues/727
Also, as you can see in the above paper, I don't think this is strictly preprocessing/CV specific:
"The learned dropout patterns also transfers to different tasks and datasets, such as from language model on Penn Treebank to English-French translation on WMT 2014"
What will we do if we want to reuse this in keras-nlp?
See also my previous comment at https://github.com/keras-team/keras-cv/pull/30#issuecomment-1008685239
Seems like this is a useful layer to me. As for what to do about reuse for KerasNLP: transparently, we don't know yet.
@LukeWood I think the implementation can wait for the preprocessing layer API refactoring?
The layer mentioned by @bhack is also interesting. Should Autodropout be implemented instead of Dropblock, or should both layers coexist?
Yes, let’s wait for the preprocessing layer refactor on this one. It should be available soon.
Should Autodropout be implemented instead of Dropblock, or should both layers coexist?
I will need to read about the differences in detail before answering that one. I’m not familiar enough with the techniques yet
As I've mentioned, the GitHub link to the reference implementation in the Google paper is broken.
Both are landing in PyTorch vision:
Interesting, I didn't know about the torchvision PR. Their proposed API looks similar to what I described above. I think it would be beneficial to also have DropBlock implemented here.
Thanks for providing so many references. This looks like a great contribution. I've added the contributions welcome label.
@sebastian-sz have you looked at the increasing DropBlock schedule that the paper recommends?
Just a reminder in case we want to extend to the 3D case: https://github.com/pytorch/vision/pull/5416/files
@LukeWood
Yes, the example is here. It seems one would need access to total_steps and current_step. I'm not sure if there is an easy way to access those without having the user explicitly pass total_steps.
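One workaround could be a Keras callback that the user constructs with total_steps, while the layer keeps its keep_probability in a tf.Variable. A sketch (the class name and the keep_probability / target_keep_probability attributes are assumptions of mine, not an existing keras-cv API):

```python
import tensorflow as tf


class DropBlockScheduler(tf.keras.callbacks.Callback):
    """Illustrative callback: linearly anneals keep_probability from 1.0
    down to each layer's target value over total_steps training batches.

    Assumes each DropBlock layer stores keep_probability in a tf.Variable
    and its final value in target_keep_probability (both hypothetical).
    """

    def __init__(self, total_steps):
        super().__init__()
        self.total_steps = total_steps
        self._step = 0

    def on_train_batch_begin(self, batch, logs=None):
        progress = min(self._step / self.total_steps, 1.0)
        for layer in self.model.layers:
            if hasattr(layer, "keep_probability") and hasattr(
                    layer, "target_keep_probability"):
                target = layer.target_keep_probability
                # interpolate: 1.0 at step 0, target once total_steps is reached
                layer.keep_probability.assign(
                    1.0 - progress * (1.0 - target))
        self._step += 1
```

The user would pass `DropBlockScheduler(total_steps=...)` to `model.fit(callbacks=[...])`; this still requires the user to supply total_steps explicitly, which matches my concern above.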
@bhack Yes, the 3D case is similar. It could be added in this PR or with a separate issue + PR. I can add it here, what are your opinions?
As you like.
@bhack I'd prefer the 3D variant to be added in a separate PR (perhaps with a separate issue). There is still some work left on the 2D variant.
DropBlock is a regularization technique that is more suitable for CNNs than regular dropout. Perhaps it would be beneficial to have it available in keras-cv?
Paper
Example TF Implementation