Closed RishabhMaheshwary closed 4 years ago
This depends on the dataset: for example, for MR we can use 128, but for IMDB we should use 256 since its average sentence length exceeds 128.
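The per-dataset choice above could be expressed as a small lookup helper. This is a hypothetical sketch, not code from the repo; the dataset keys and lengths only reflect the values mentioned in this thread, with 128 as the default fallback.

```python
# Illustrative per-dataset max sequence lengths, based on the values
# discussed in this thread (MR: 128, IMDB: 256, Fake news: 512).
DATASET_MAX_SEQ_LEN = {
    "mr": 128,    # short movie-review snippets
    "imdb": 256,  # average review length exceeds 128 tokens
    "fake": 512,  # long news articles
}

def max_seq_length(dataset: str, default: int = 128) -> int:
    """Return the max sequence length for a dataset, falling back to a default."""
    return DATASET_MAX_SEQ_LEN.get(dataset.lower(), default)
```

Any dataset not listed (e.g. Yelp here) would fall back to the 128 default mentioned later in the thread.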
For Fake news are you using 256 or more? And why have you kept the semantic similarity score different across datasets and also different across models?
For Fake news I used 512. As for semantic similarity, I used my local execution file instead of the default values in the python file.
Sorry, what do you mean by "local execution file"? I am not able to follow.
By execution files I mean the scripts that run the python file with the hyperparameter settings, such as "run_attack_classification.py" in this repo.
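Such an execution file is essentially a thin launcher that assembles the command line for the attack script. A minimal sketch in the style of run_attack_classification.py follows; the flag names and paths here are illustrative assumptions, not the repo's exact values.

```python
import os

# Assemble the attack command with the desired hyperparameter settings.
# Flag names and paths below are illustrative, not the repo's exact ones.
command = (
    "python attack_classification.py"
    " --dataset_path data/imdb"
    " --target_model bert"
    " --max_seq_length 256"
    " --sim_score_threshold 0.7"
)

if __name__ == "__main__":
    os.system(command)  # launch the attack with these settings
```

Overriding hyperparameters in this launcher is what the commenter means by using a "local execution file" rather than the defaults hard-coded in the python file.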
Ok. Actually, I was asking why you kept the semantic similarity different across datasets and also across models. For example, for IMDB on BERT it is 0.86, for Yelp on BERT it is 0.74, for Yelp on wordLSTM it is 0.79, and so on. I cannot find an explanation of this in the paper.
I see. That similarity number is the average similarity score between the output adversarial examples and the original sentences. I can only control the similarity threshold in the algorithm, not the resulting similarity.
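The distinction above can be made concrete: the threshold gates which adversarial examples are accepted, while the reported number is just the mean similarity of the accepted pairs, which cannot be set directly. A minimal sketch, assuming `sim_fn` stands in for the paper's sentence-similarity scorer:

```python
def average_similarity(pairs, sim_fn, threshold):
    """Keep (original, adversarial) pairs whose similarity clears the
    controllable threshold, then report the average similarity of the
    kept pairs -- the quantity tabulated in the paper."""
    scores = [sim_fn(orig, adv) for orig, adv in pairs]
    kept = [s for s in scores if s >= threshold]
    return sum(kept) / len(kept) if kept else 0.0
```

With a threshold of 0.7, a batch scoring 0.9, 0.6, and 0.8 would report an average of 0.85: the 0.6 example is rejected, and the mean of the survivors is what ends up in the results table.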
Ok, makes sense. Do you have the pretrained model parameters for the wordCNN and wordLSTM models, like the ones you shared for BERT?
Thanks!
I may have them for some dataset stored in my servers. Which one do you need?
If possible, both wordLSTM and wordCNN.
Thanks :)
Which datasets?
All classification and inference datasets.
Do you have the pretrained models? Or, if I have to train them myself, will the vocab.txt per dataset remain the same as the one you shared in the Google Drive link?
Hi, I just uploaded the pretrained model parameters for LSTM and CNN; the links are in the README. Please find them there, and let me know if you have any more questions.
What was your max_sequence_length value in the original experiments whose results you reported in the paper? The default is set to 128.