maximilianmozes / fgws

Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples (EACL 2021)

How are the clean & adversarial test sets configured? #1

Closed by bangawayoo 2 years ago

bangawayoo commented 2 years ago

Hi Maximilian, thank you for your inspiring work.

I had a minor question about how the clean and adversarial test sets were configured. From my understanding, you sampled a subset of the test set (e.g. 2000 examples) and then generated adversarial examples from that subset. I am wondering whether the clean samples came from the same subset or from a different, randomly sampled subset.
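For concreteness, here is a rough sketch of the two setups I have in mind (all names, including the placeholder `attack()`, are mine and not from this repo):

```python
import random

random.seed(0)
test_set = [f"test example {i}" for i in range(10000)]  # placeholder data


def attack(text):
    """Stand-in for the real word-substitution attack."""
    return text + " [perturbed]"


# Setup A: clean and adversarial examples come from the same 2000-example subset.
subset = random.sample(test_set, 2000)
clean_a = list(subset)
adv_a = [attack(x) for x in subset]

# Setup B: the clean examples come from an independently sampled subset.
clean_b = random.sample(test_set, 2000)
adv_b = [attack(x) for x in random.sample(test_set, 2000)]
```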

From `detect.py`, line 73 onwards:

`for idx, (adv_zipped, orig_cl, orig_text) in enumerate(zip(adv_examples, attack_pols, attack_sequences)):`

It seems that you are looping over the adversarial examples and the original examples. Is `adv_zipped` the attacked version of `orig_text`? I am wondering whether `adv_zipped["clean"]` is equivalent to `orig_text`.
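Concretely, I would expect a check along these lines to pass (a rough sketch with toy data; the exact structure of `adv_zipped` and the meaning of the other variables are my assumptions based on how they are indexed in `detect.py`):

```python
# Toy stand-ins for adv_examples / attack_pols / attack_sequences from detect.py;
# the real objects are loaded by the script, this only illustrates the check.
adv_examples = [{"clean": "the movie was great", "adv": "the film was great"}]
attack_pols = [1]                            # original labels (assumed meaning)
attack_sequences = ["the movie was great"]   # original texts (assumed meaning)

for idx, (adv_zipped, orig_cl, orig_text) in enumerate(
    zip(adv_examples, attack_pols, attack_sequences)
):
    assert adv_zipped["clean"] == orig_text, (idx, adv_zipped["clean"], orig_text)

print("adv_zipped['clean'] matches orig_text for every example")
```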

Thanks again!

maximilianmozes commented 2 years ago

Hi, yes, the clean and adversarial samples come from the same subset, and `adv_zipped['clean']` and `orig_text` should be the same.

bangawayoo commented 2 years ago

Thanks for the reply! I verified that they are the same. I have another question about memory requirements, but I will close this issue and open a new one.