jind11 / TextFooler

A Model for Natural Language Attack on Text Classification and Inference

Missing vocabularies for wordCNN and wordLSTM pretrained models #18

Open mahossam opened 4 years ago

mahossam commented 4 years ago

I tried to load the pretrained wordCNN/LSTM models, but I found that the embedding layer uses a 400K-word vocabulary with 200-dimensional embeddings. It seems that you used the Wikipedia-pretrained GloVe embeddings from https://nlp.stanford.edu/projects/glove/. However, you mentioned before in this reply that you used 10K and 20K vocabularies for the CNN and LSTM models.
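
For reference, here is a minimal sketch of how I checked the embedding shape, assuming the checkpoint is a plain PyTorch state dict (the filename `wordCNN.pt` is hypothetical; substitute the actual checkpoint file):

```python
import torch

# Hypothetical checkpoint path; replace with the released wordCNN/wordLSTM file.
state_dict = torch.load("wordCNN.pt", map_location="cpu")

for name, tensor in state_dict.items():
    if "embedding" in name.lower():
        # Prints something like torch.Size([400000, 200]):
        # a 400K-word vocabulary x 200-dimensional vectors,
        # matching the Stanford Wikipedia GloVe (glove.6B.200d) setup.
        print(name, tuple(tensor.shape))
```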

Could you please explain which vocab sizes you used for the published results in the paper? And if possible, could you provide the 10-20K word vocabularies used for these models (as you did with BERT)?

Thank you! Cheers