makcedward / nlpaug

Data augmentation for NLP
https://makcedward.github.io/
MIT License

Queries regarding Contextual Word Embeddings Augmenter [BERT, etc.] #219

Closed · katreparitosh closed this issue 2 years ago

katreparitosh commented 3 years ago

Hi Edward,

First of all, it's a great piece of work created and open-sourced by you! Thanks a lot.

While using contextual word embeddings (say BERT or DistilBERT), when I pass just one word and select action = "insert", it adds a word before or after depending on the context.

[screenshot: output of the insert action]
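This is roughly the call I am making — a minimal sketch, with the model name and input text only as examples:

```python
import nlpaug.augmenter.word as naw

# Contextual word embeddings augmenter with the insert action:
# a position is picked and a token predicted by the masked language model is inserted.
aug = naw.ContextualWordEmbsAug(model_path='bert-base-uncased', action='insert')
print(aug.augment('beautiful'))
```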

When I choose action = "substitute" with n = 3:

[screenshot: output of the substitute action with n = 3]
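And the substitute call, again only a sketch (here n asks for three augmented variants of the same input):

```python
import nlpaug.augmenter.word as naw

# Same augmenter, but existing tokens are replaced instead of new ones being inserted.
aug = naw.ContextualWordEmbsAug(model_path='bert-base-uncased', action='substitute')

# n=3 requests three augmented outputs for one input.
print(aug.augment('beautiful day', n=3))
```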

Q1. Could you help me understand why this method returns the same output n times for most uni-/bi-grams?

Q2. The PPDB outputs look like ['pretty', 'wonderful', 'lovely'] for the word "beautiful". How do we achieve similar functionality through contextualized word embeddings? Why do the outputs repeat themselves for uni-/bi-grams?
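For comparison, the PPDB-style synonym lookup I mean is along these lines — only a sketch, and the model_path is a placeholder for a PPDB file downloaded separately:

```python
import nlpaug.augmenter.word as naw

# Dictionary-based synonym augmenter backed by PPDB.
# '/path/to/ppdb-2.0-s-all' is a placeholder; the PPDB file must be obtained separately.
aug = naw.SynonymAug(aug_src='ppdb', model_path='/path/to/ppdb-2.0-s-all')
print(aug.augment('beautiful'))
```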

It would be great if you could advise on the above or point me in the right direction.

Regards, Paritosh

makcedward commented 3 years ago

Q1: n = 3 means it will generate 3 outputs from 1 input.

Q2: For contextualized word embeddings, I am using a masked language model (MLM). In short, random tokens are picked and replaced one by one. For example:

- Time 0 (input): "it's a great piece of work created and open-sourced by you"
- Time 1 (first replacement): mask "created" to get "it's a great piece of work [MASK] and open-sourced by you", then generate "it's a great piece of work initialized and open-sourced by you"
- Time 2 (second replacement): mask "you" to get "it's a great piece of work initialized and open-sourced by [MASK]", then generate "it's a great piece of work initialized and open-sourced by us"

and so on...
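To illustrate only the masking step, here is a rough sketch using the Hugging Face fill-mask pipeline. This is an illustration of the MLM idea under the assumptions above, not nlpaug's internal code:

```python
from transformers import pipeline

# Fill-mask pipeline backed by a BERT masked language model.
fill_mask = pipeline('fill-mask', model='bert-base-uncased')

# Mask one token and let the model propose contextual replacements.
masked = "it's a great piece of work [MASK] and open-sourced by you"
for candidate in fill_mask(masked, top_k=3):
    print(candidate['token_str'], round(candidate['score'], 3))
```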