makcedward / nlpaug

Data augmentation for NLP
https://makcedward.github.io/
MIT License

Consistent data type (list) for char augmenters #302

fratambot closed this issue 2 years ago

fratambot commented 2 years ago

First of all, let me thank you for this amazing library! I am using nlpaug.augmenter.char to augment my text data in different ways, and I noticed that when I call the .augment() method with n=1, it returns a string:

import nlpaug.augmenter.char as nac
aug_typo = nac.KeyboardAug(...)
augmentations = aug_typo.augment("my string", n=1)
print(type(augmentations))

>> <class 'str'>

whereas for more than one augmentation it returns a list of strings:

import nlpaug.augmenter.char as nac
aug_typo = nac.KeyboardAug(...)
augmentations = aug_typo.augment("my string", n=2)
print(type(augmentations))

>> <class 'list'>

It would be nice to always get the same output type, i.e. a list containing a single string even when only one augmentation is requested.

I don't know whether this would cause regressions, but as it stands I first have to check whether the output is a string or a list, and in the latter case loop over the list to get the strings inside.

Many thanks in advance and have a nice day! :)

makcedward commented 2 years ago

By design, the return type is the same as the input type. I agree that consistency is important, and this will be improved in the next release.
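
Until the return type is consistent, the string-or-list check described in the issue can be wrapped in a small helper. The following is only a minimal sketch: as_list is a hypothetical name that is not part of nlpaug, and the sample values stand in for whatever .augment() returns.

```python
from typing import List, Union


def as_list(result: Union[str, List[str]]) -> List[str]:
    """Normalize an augmenter result to a list of strings (hypothetical helper)."""
    # A bare string (the n=1 case reported in this issue) becomes a one-element list.
    if isinstance(result, str):
        return [result]
    # Anything else is assumed to already be a sequence of strings.
    return list(result)


# Stand-in values; in practice these would come from aug_typo.augment(...).
for text in as_list("my strinh"):
    print(text)
for text in as_list(["my strung", "mu string"]):
    print(text)
```

With this wrapper, downstream code can always loop over the result without checking its type first.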