lancopku / text-autoaugment

[EMNLP 2021] Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification
https://arxiv.org/abs/2109.00523
MIT License
125 stars 16 forks

The return in augmentation.py cannot be used as a data source #9

Closed: mvllwong closed this issue 2 years ago

mvllwong commented 2 years ago

Each transform function in augmentation.py should return a Str instead of the List that is returned by default. For example:

    def random_word_delete(text, m):
        return aug.augment(text)

should become:

    def random_word_delete(text, m):
        return aug.augment(text)[0]

mvllwong commented 2 years ago

https://github.com/lancopku/text-autoaugment/blob/39173c76218510e7be813d0b61e2bf3777487a3d/taa/augmentation.py#L11-L56

RenShuhuai-Andy commented 2 years ago

Hi, sorry for the late reply.

According to https://github.com/makcedward/nlpaug/blob/master/nlpaug/augmenter/word/random.py#L161, a transform function returns a Str when its input is a Str and a List when its input is a List.

Our code in https://github.com/lancopku/text-autoaugment/blob/main/taa/data.py#L91-L111 ensures the output aug_texts is a List.
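
The two comments above report different return types for the same `aug.augment(text)` call, so a caller that needs a single string could normalize the result instead of relying on either behaviour. The following is a minimal sketch of that idea, not code from text-autoaugment or nlpaug: `as_single_text`, `returns_str`, and `returns_list` are hypothetical names, and the two stand-in functions only imitate the Str and List behaviours discussed here.

```python
from typing import List, Union


def as_single_text(result: Union[str, List[str]]) -> str:
    """Normalize an augmenter's return value to one string.

    Hypothetical helper (an assumption for illustration); it is not
    part of text-autoaugment or nlpaug.
    """
    if isinstance(result, str):
        # Already a single string: pass it through unchanged.
        return result
    if isinstance(result, (list, tuple)):
        if not result:
            raise ValueError("augmenter returned an empty sequence")
        # Keep the first augmented variant, as in the proposed `[0]` fix.
        return result[0]
    raise TypeError(f"unexpected augmenter output: {type(result).__name__}")


# Stand-ins for the two behaviours reported in this thread.
def returns_str(text: str) -> str:
    return text.upper()


def returns_list(text: str) -> List[str]:
    return [text.upper()]


print(as_single_text(returns_str("hello world")))
print(as_single_text(returns_list("hello world")))
```

With a wrapper like this, a transform such as `random_word_delete` could return `as_single_text(aug.augment(text))` and yield a Str in either case, assuming the augmenter only ever returns a string or a sequence of strings.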