Open kgarg8 opened 2 years ago
I've got the same issue. Did you find a solution?
On my side I'm just sending text lists to a pipeline that includes ContextualWordEmbsForSentenceAug:
```python
import pandas as pd

import nlpaug.augmenter.word as naw
import nlpaug.augmenter.sentence as nas
import nlpaug.flow as naf

pipeline = [
    naw.SynonymAug(),
    naw.AntonymAug(),
    naw.ContextualWordEmbsAug(),
    nas.ContextualWordEmbsForSentenceAug(),
]
aug = naf.Sometimes(pipeline, aug_p=1 / len(pipeline), verbose=1)

res = []
for index, data in df.groupby(label_col):
    aug_data = aug.augment(data[text_col].tolist(), num_thread=5)
    a_data = pd.DataFrame(aug_data, columns=['text'])
    a_data['label'] = index
    res.append(a_data)
aug_data = pd.concat(res)
```
Unfortunately, no
Anyone know if this was fixed?
It seems that this could be related to an error in the tokenizer, as shown here. Meanwhile, I have managed to overcome this by passing each string to the augmenter individually:
```python
from nlpaug.augmenter.sentence import ContextualWordEmbsForSentenceAug

aug = ContextualWordEmbsForSentenceAug()
for text in df["column"].tolist():
    print(aug.augment(text, num_thread=5))
```
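If you want to keep batch calls where they happen to work, the per-string fallback above can be generalized into a small wrapper that tries the batch call first and falls back to one-at-a-time augmentation when the batch path raises. This is only a sketch: `augment_with_fallback` and `fake_augment` are hypothetical names, and `fake_augment` is a stand-in for a real nlpaug augmenter's `augment` method, used here only to demonstrate the control flow:

```python
def augment_with_fallback(augment, texts):
    """Try augmenting the whole batch; if that fails, fall back to
    augmenting each string individually and collect the results."""
    try:
        return augment(texts)
    except Exception:
        out = []
        for text in texts:
            result = augment(text)
            # augment() may return a list even for a single string;
            # normalize everything to a flat list of strings
            if isinstance(result, list):
                out.extend(result)
            else:
                out.append(result)
        return out

# Stand-in augmenter used only to exercise the wrapper: it rejects
# batch input, mimicking the tokenizer error reported in this thread
def fake_augment(data):
    if isinstance(data, list):
        raise TypeError("batch input not supported")
    return data.upper()

print(augment_with_fallback(fake_augment, ["hello", "world"]))
# ['HELLO', 'WORLD']
```

With a real augmenter you would pass `aug.augment` as the first argument; the wrapper only changes how the inputs are fed in, not what the augmenter does.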
Hi,
I encounter the following error when I try to supply a batch to nas.ContextualWordEmbsForSentenceAug. After checking other posts, I expected that supplying a list of batch_size would work, but it doesn't. Any suggestions would be appreciated. Thanks