djaszak / NLPAug

A framework to simplify the usage of common Data Augmentation methods in the NLP context.
MIT License

Word Level Noising #8

Open djaszak opened 2 years ago

djaszak commented 2 years ago
With “unigram noising”, each word in the input data is replaced by another word with a certain probability.
With “blank noising”, words are instead replaced by the placeholder “_”. By adopting both patterns, the authors
achieved improved results in their experiments.
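
A minimal sketch of how the two noising patterns described above could look in Python. This is not the NLPAug API: the function names `blank_noising` and `unigram_noising`, the parameter `p`, and the `vocab_counts` argument are hypothetical. The sketch also assumes that input is already tokenized into a list of words and that replacement words for unigram noising are drawn in proportion to their corpus frequency, which the quoted text does not specify.

```python
import random
from collections import Counter

BLANK = "_"  # placeholder token used by blank noising


def blank_noising(tokens, p, rng=None):
    """Replace each token with "_" independently with probability p."""
    if rng is None:
        rng = random.Random()
    return [BLANK if rng.random() < p else tok for tok in tokens]


def unigram_noising(tokens, p, vocab_counts, rng=None):
    """Replace each token, with probability p, by a word sampled from vocab_counts.

    Assumption: replacements are drawn proportionally to word frequency.
    """
    if rng is None:
        rng = random.Random()
    words = list(vocab_counts)
    weights = [vocab_counts[w] for w in words]
    noised = []
    for tok in tokens:
        if rng.random() < p:
            # Sample one replacement word, weighted by its count.
            noised.append(rng.choices(words, weights=weights, k=1)[0])
        else:
            noised.append(tok)
    return noised


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the dog sat too".split()
    counts = Counter(corpus)  # unigram frequencies from a toy corpus
    sentence = "the dog sat on the mat".split()
    rng = random.Random(0)  # seeded for repeatable experiments
    print(blank_noising(sentence, p=0.3, rng=rng))
    print(unigram_noising(sentence, p=0.3, vocab_counts=counts, rng=rng))
```

Both functions keep the sentence length unchanged, so labels and token alignments stay valid. Passing a seeded `random.Random` makes the augmentation reproducible.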

From: Markus Bayer, Marc-André Kaufhold, Christian Reuter (2021), "A Survey on Data Augmentation for Text Classification"