Closed h-aboutalebi closed 2 years ago
Hi! I think I found where the issue is: I apply normalization to the inputs when I load the model (e.g., https://github.com/RobustBench/robustbench/blob/master/robustbench/model_zoo/architectures/xcit.py#L95), even though, when I trained the models, I applied no normalization. Let me fix and test this!
Meanwhile, if you want to quickly see how to reproduce the results without using load_model from here, you can take a look at how I evaluated the model for the paper here.
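For context, the mismatch can be illustrated with a minimal, self-contained sketch (made-up single-channel numbers, not the actual XCiT code): a model trained on raw [0, 1] pixels receives shifted inputs once a normalization wrapper is added at load time.

```python
# Standard ImageNet normalization stats (one channel shown for brevity).
MEAN, STD = 0.485, 0.229

def normalize(x):
    """Wrapper applied at load time in the buggy version."""
    return (x - MEAN) / STD

# A pixel value the model saw during (unnormalized) training...
pixel = 0.5
# ...arrives shifted and rescaled at evaluation time, so the network
# operates far from the input distribution it was trained on:
shifted = normalize(pixel)
print(pixel, shifted)
```

Removing the wrapper (or retraining with normalization) restores the input distribution the weights expect.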
Hi @h-aboutalebi! This should be fixed on master now, with 513d60c. Now you should be able to reproduce the reported numbers with
python -m robustbench.eval --n_ex=5000 --dataset=imagenet --threat_model=Linf --model_name=Debenedetti2022Light_XCiT-S12 --data_dir=/data/imagenet --batch_size=128 --eps=0.0156862745
Of course, as --model_name you can also specify Debenedetti2022Light_XCiT-M12 or Debenedetti2022Light_XCiT-L12.
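As a side note, the --eps value in the command above is just the Linf budget 4/255 written out in decimal:

```python
# The Linf perturbation budget 4/255 expanded as a decimal,
# matching the --eps flag in the eval command.
eps = 4 / 255
print(f"{eps:.10f}")  # 0.0156862745
```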
Please, feel free to re-open the issue if this doesn't work!
Hi @dedeswim, thanks for your help. Unfortunately, I still cannot replicate the results; now the accuracy has dropped even further. Thanks!
Hi @h-aboutalebi how are you trying to reproduce the results? Would you mind sharing your code if possible? Moreover, what are the exact numbers you are getting?
Hi @dedeswim, I cannot reproduce the results reported in the table for ImageNet with epsilon 4/255 for XCiT-S12, XCiT-M12, and XCiT-L12. The accuracy I get is much lower for all three models: around 45% robust and 50% clean. Can you please help me reproduce the results? Am I missing something?