Closed innat closed 11 months ago
cc. @chjort
If we want to support this, I suggest we merge it into RandomCutout by giving it an additional argument like num_cutouts or something. @LukeWood wdyt?
Yeah, I'd prefer a num_cutouts parameter... perhaps supporting a range to randomly sample from.
Hi, I would like to contribute to this task.
So from the reference link provided above, what I inferred is that num_cutouts stands for the number of cutouts that must be cut from the image provided, right?
Absolutely! And I'd also like this to support a range.
So you can have
CutOut(num_cutouts=(1, 10))
Then it will randomly sample a count between 1 and 10 and perform that many cutouts.
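A minimal sketch of the range idea above, in plain Python rather than TensorFlow. The names sample_num_cutouts and random_cutout are hypothetical and are not the KerasCV API; treating the range as inclusive on both ends is also an assumption, since the thread does not say.

```python
import random


def sample_num_cutouts(num_cutouts, rng):
    """Return a cutout count from an int or a (low, high) range.

    Assumption: the range is inclusive on both ends.
    """
    if isinstance(num_cutouts, int):
        return num_cutouts
    low, high = num_cutouts
    return rng.randint(low, high)


def random_cutout(image, num_cutouts, size, rng, fill=0):
    """Fill randomly placed size x size squares in a 2D list-of-lists image.

    The number of squares is drawn once per call; squares may overlap.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input is left untouched
    for _ in range(sample_num_cutouts(num_cutouts, rng)):
        top = rng.randrange(0, height - size + 1)
        left = rng.randrange(0, width - size + 1)
        for y in range(top, top + size):
            for x in range(left, left + size):
                out[y][x] = fill
    return out


rng = random.Random(0)
image = [[1] * 8 for _ in range(8)]
augmented = random_cutout(image, num_cutouts=(1, 10), size=2, rng=rng)
```

Passing a plain int, e.g. num_cutouts=3, always performs exactly that many cutouts, so the single-value behaviour stays available.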
@LukeWood what will those cutouts do? From the link provided, I understood it's done to reduce overfitting, if I'm not wrong. And how will they impact accuracy? And is this a kind of data augmentation? I want to try this one and contribute if this is still open.
I think this can be closed now
@quantumalaviya Thanks for this PR. One query and concern on this:
for _ in tf.range(self._sample_num_cutouts())
Isn't looping through the number of cutouts going to cause performance issues? It's the same if you use tf.map_fn; under the hood, it also does something like that, I think. @parikshit14 did some benchmarks, HERE. Have they been reproduced?
cc. @LukeWood @bhack
Yeah, I mentioned it in the PR (#207). I was waiting on @parikshit14 for the changes.
I'll just go ahead and try to implement the changes myself based on #186.
I have the changes ready; I just needed a green signal to push them.
But changes to vectorize num_cutouts in fill_rectangles and rectangle_mask will make them incompatible with CutMix. So to counter this we have the following options (non-exhaustive):
- we can create a flag-like variable to separate the two (RandomCutout and CutMix) inside fill_utils
- we can create a separate file for the vectorized fill_utils
What should we prefer?
I wonder if the changes can be generalized to include only 1 rectangle.
It should still be compatible with CutMix if we just set num_cutouts=1 in fill_rectangles, right? We will have to make a change to both in the PR though. Thanks!
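As a rough illustration of why a multi-rectangle fill can still serve CutMix, here is a plain-Python sketch. These rectangle_mask and fill_rectangles functions are simplified, hypothetical stand-ins and do not reproduce the actual fill_utils signatures; the (top, left, height, width) rectangle format is an assumption made for the example.

```python
def rectangle_mask(height, width, rectangles):
    """Return a 2D boolean mask that is True inside any given rectangle.

    Each rectangle is (top, left, rect_height, rect_width) and is clipped
    to the image bounds.
    """
    mask = [[False] * width for _ in range(height)]
    for top, left, rect_h, rect_w in rectangles:
        for y in range(max(top, 0), min(top + rect_h, height)):
            for x in range(max(left, 0), min(left + rect_w, width)):
                mask[y][x] = True
    return mask


def fill_rectangles(image, fill, rectangles):
    """Replace pixels of `image` inside the rectangles with pixels of `fill`."""
    height, width = len(image), len(image[0])
    mask = rectangle_mask(height, width, rectangles)
    return [
        [fill[y][x] if mask[y][x] else image[y][x] for x in range(width)]
        for y in range(height)
    ]


image_a = [[1] * 6 for _ in range(6)]
image_b = [[2] * 6 for _ in range(6)]
zeros = [[0] * 6 for _ in range(6)]

# RandomCutout-style call: several rectangles, constant fill.
cutout = fill_rectangles(image_a, zeros, [(0, 0, 2, 2), (3, 3, 2, 2)])

# CutMix-style call: the same function with exactly one rectangle,
# filled from a second image.
mixed = fill_rectangles(image_a, image_b, [(1, 1, 3, 3)])
```

In this sketch the single-rectangle case is just a list of length one, so no separate flag or file is needed; whether that carries over to the vectorized TensorFlow version is exactly what the PR would have to confirm.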
Yes @LukeWood, I already set num_cutouts=1 for cut_mix in PR #217.
This is done, closing as @parikshit14 handled this.
@LukeWood cc. @parikshit14 Was it added? PR #217 was closed. If it's not added yet, could you please re-open the ticket?
I think we had to roll it back, so sure thing we can reopen
Let's deprioritize this unless there's a strong use case.
@LukeWood Could you please elaborate on how to determine a strong use case? For example, it's quite popular on Kaggle, but I'm not sure if that's the right metric in terms of use case. Are you expecting more users, or anything specific?
Interesting @innat - I did not realize this. What advantage does this provide over CutMix? Do people tend to find stronger performance?
@LukeWood I think CutMix- and MixUp-type augmentations can't be used in a regression model, at least not in a straightforward way, unless the regression is remodeled as a classification problem.
As for the advantage of cutout over CutMix, I don't know for sure, but I think it depends on the dataset. If we go through the top solutions of Kaggle CV-related competitions, we find that cutout layers are used a good amount, e.g. example 1, example 2, etc.
It's kind of an alternative to cutout augmentation, but with more options.
tf-code reference. https://www.kaggle.com/cdeotte/tfrecord-experiments-upsample-and-coarse-dropout
Demo.