Adversarial examples provided in google drive are not able to fool your trained target model (BERT) itself

jind11 / TextFooler

A Model for Natural Language Attack on Text Classification and Inference

MIT License

Adversarial examples provided in google drive are not able to fool your trained target model (BERT) itself #23

Closed by SachJbp 4 years ago

SachJbp commented 4 years ago

The adversarial examples provided on Google Drive do not fool your own trained target model (BERT). So far I have checked 'mr_bert.txt', and for most of the claimed adversarial texts the model does not predict the label listed in the file. Please look into this.

A few examples:

orig sent (0): davis is so enamored of her own creation that she ca n't see how insufferable the character is
adv sent (1): davis is well enthralled of her own creation that she ca n't see how insufferable the character is

orig sent (0): it 's hard to imagine that even very small children will be impressed by this tired retread
adv sent (1): it 's intense to thinking that even immeasurably small children will is impressed by this tired retread
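A check like the one described above can be sketched roughly as follows. This is only an illustration, not code from the TextFooler repository: it assumes the file consists of `orig sent (N): ...` / `adv sent (N): ...` line pairs as quoted in this comment, and `predict` is a hypothetical callable that wraps whatever target model is being tested and returns an integer label for a sentence.

```python
import re
from typing import Callable, Iterable, Iterator, NamedTuple, Tuple


class AdvPair(NamedTuple):
    orig_label: int
    orig_text: str
    adv_label: int   # label the file claims the model predicts for adv_text
    adv_text: str


# Assumed line format, taken from the examples quoted in this thread.
_LINE = re.compile(r"^(orig|adv) sent \((\d+)\):\s*(.*)$")


def parse_pairs(lines: Iterable[str]) -> Iterator[AdvPair]:
    """Yield one AdvPair for each 'orig sent' line followed by an 'adv sent' line."""
    pending = None
    for raw in lines:
        match = _LINE.match(raw.strip())
        if match is None:
            continue  # skip blank or unrelated lines
        kind, label, text = match.group(1), int(match.group(2)), match.group(3)
        if kind == "orig":
            pending = (label, text)
        elif pending is not None:
            yield AdvPair(pending[0], pending[1], label, text)
            pending = None


def count_failed(pairs: Iterable[AdvPair],
                 predict: Callable[[str], int]) -> Tuple[int, int]:
    """Return (failed, total), where 'failed' counts adversarial texts whose
    predicted label differs from the label claimed in the file."""
    failed = total = 0
    for pair in pairs:
        total += 1
        if predict(pair.adv_text) != pair.adv_label:
            failed += 1
    return failed, total
```

With a real model, one would open the downloaded file, pass the file object to `parse_pairs`, and pass the model's own prediction function as `predict`. A large `failed` count would indicate that the examples were crafted against a different model.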

jind11 commented 4 years ago

Well, the adversarial examples on my local machine may have been crafted against a different BERT model trained with another random seed. I can help by regenerating the adversarial examples. Do you only need mr_bert.txt, or other files as well?

SachJbp commented 4 years ago

It would be great if you could upload the trained BERT model parameters that were used to generate the adversarial examples in the mr_bert.txt file on Google Drive.

PS: To replicate the results, I used the trained BERT parameters from the link in the README, and the outcome did not match the claimed results.

jind11 commented 4 years ago

I have just updated the mr_bert file; this one should be OK now: https://drive.google.com/drive/folders/12yeqcqZiEWuncC5zhSUmKBC3GLFiCEaN?usp=sharing

SachJbp commented 4 years ago

Thanks for that. Could you please report the new after-attack accuracy?

jind11 commented 4 years ago

For target model bert: original accuracy: 86.000%, adv accuracy: 7.600%, avg changed rate: 15.618%, num of queries: 143.3
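For readers comparing these figures, the two accuracies can be turned into a single attack success rate. The sketch below is not from the thread: it assumes both accuracies were measured on the same evaluation set and uses a commonly used definition, the fraction of originally correct predictions that the attack flips. `attack_success_rate` is a hypothetical helper name.

```python
def attack_success_rate(orig_acc: float, adv_acc: float) -> float:
    """Fraction of originally correct predictions flipped by the attack.

    Both arguments are accuracies in percent on the same evaluation set
    (an assumption; the thread does not state how they were measured).
    """
    if orig_acc <= 0:
        raise ValueError("original accuracy must be positive")
    return (orig_acc - adv_acc) / orig_acc


# Figures reported above: original accuracy 86.0%, adversarial accuracy 7.6%.
rate = attack_success_rate(86.0, 7.6)
print(f"{rate:.1%}")  # prints 91.2%
```

Under these assumptions, roughly 91% of the sentences the model originally classified correctly are misclassified after the attack.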

SachJbp commented 4 years ago

Just to confirm: is this the same model whose trained parameters are linked from Google Drive in your README? If not, please update the model there too.

jind11 commented 4 years ago

I think it should be the same model.

SachJbp commented 4 years ago

Okay, Thanks.