max-andr / relu_networks_overconfident

Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem [CVPR 2019, oral]
https://arxiv.org/abs/1812.05720

Some questions about data processing. #4

Closed: jimo17 closed this issue 1 year ago

jimo17 commented 2 years ago

Thanks for open-sourcing this code. I am very interested in your paper, and I have a question: why does line 144 in train.py use [x, x, x] instead of just a single x?

https://github.com/max-andr/relu_networks_overconfident/blob/ce2d3a1ab8434cdb46a2d20da291411052474636/train.py#L144

max-andr commented 2 years ago

Hi,

If I remember correctly, it's just needed to make sure that we can have more adversarial (adv) samples than the batch size (e.g., up to 3x more). Note that we subsample the generated data via [:n_adv] anyway, so it's a very ad-hoc trick to make things work. A much cleaner solution definitely exists (and the factor of 3x is not special in any way; it's just "large enough" for the practical values of n_adv that we were interested in).
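For illustration, here is a minimal NumPy sketch of the trick described above. It is not the repo's actual code: `generate_adv` is a hypothetical attack function, and the names `x` and `n_adv` are stand-ins for the corresponding variables in train.py.

```python
import numpy as np

def adv_batch(x, n_adv, generate_adv):
    # Tile the clean batch 3x so that up to 3 * batch_size adversarial
    # samples can be drawn from it (3x is just "large enough", not special).
    x_tiled = np.concatenate([x, x, x], axis=0)
    # Hypothetical attack that returns one adversarial sample per input.
    x_adv = generate_adv(x_tiled)
    # Keep only the first n_adv samples, mirroring the [:n_adv] subsampling.
    return x_adv[:n_adv]
```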

Best, Maksym