machinelearningnuremberg / WellTunedSimpleNets

[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets
Apache License 2.0

How is data augmentation applied to tabular data? #2

Closed: hellowangqian closed this issue 2 years ago

hellowangqian commented 2 years ago

Hi author, thanks for sharing the code. I'm wondering how data augmentation strategies like mix-up, cut-out, cut-mix, etc., can be applied to tabular data (I understand they are usually applied to images though). Please advise, many thanks.

perschi commented 2 years ago

While digging through the code, I found that the authors refer to CutOut and CutMix. In these two files, you can see that CutOut replaces randomly chosen columns with zero, while CutMix replaces them with the values from another randomly chosen sample in the batch.
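To make the column-wise idea concrete, here is a minimal NumPy sketch of both operations. It is not the repository's code: the function names `tabular_cutout` and `tabular_cutmix`, the fraction parameter `p`, and the choice to pair each row with a partner via a batch permutation are all assumptions made for illustration.

```python
import numpy as np


def tabular_cutout(X, p=0.3, rng=None):
    """Zero out a random subset of columns for the whole batch (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    n_features = X.shape[1]
    # Assumption: mask a fixed fraction p of the columns, at least one.
    n_masked = max(1, int(p * n_features))
    cols = rng.choice(n_features, size=n_masked, replace=False)
    X_out = X.copy()
    X_out[:, cols] = 0.0
    return X_out, cols


def tabular_cutmix(X, y, p=0.3, rng=None):
    """Swap a random subset of columns with values from another sample (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    n_samples, n_features = X.shape
    n_swapped = max(1, int(p * n_features))
    cols = rng.choice(n_features, size=n_swapped, replace=False)
    # Assumption: each row gets its partner row from a random permutation of the batch.
    perm = rng.permutation(n_samples)
    X_out = X.copy()
    X_out[:, cols] = X[perm][:, cols]
    # Fraction of features that still come from the original sample;
    # usable as the weight when combining the losses for y and y[perm].
    lam = 1.0 - n_swapped / n_features
    return X_out, y, y[perm], lam
```

In training, the CutMix outputs would typically be combined as `lam * loss(pred, y_a) + (1 - lam) * loss(pred, y_b)`, where `y_a` and `y_b` are the two returned label arrays.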

ArlindKadra commented 2 years ago

It is as @perschi nicely described; you can additionally find MixUp here. Basically, since the input is no longer an image and neighboring features carry no related information, we select the features randomly.
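For comparison, MixUp in its standard formulation does not pick features at all: it blends two whole samples and their labels with a single weight drawn from a Beta distribution. The sketch below shows that standard form on a tabular batch. It assumes numeric features and one-hot labels, and the name `tabular_mixup` and the default `alpha` are hypothetical, not taken from the repository.

```python
import numpy as np


def tabular_mixup(X, y_onehot, alpha=1.0, rng=None):
    """Convex combination of each row with a randomly chosen partner row (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    # One mixing weight for the whole batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Assumption: partners come from a random permutation of the batch.
    perm = rng.permutation(X.shape[0])
    X_mixed = lam * X + (1.0 - lam) * X[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return X_mixed, y_mixed, lam
```

Because every feature is interpolated, this form only makes sense for numerical columns; categorical columns would need to be encoded first or handled separately.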

Let me know if there is something else that is unclear :).