Closed · my91porn · closed 3 years ago
Hi,
Great catch! Indeed, I missed this when I was writing the README. I have now updated it so that `backprop=True` is set only for `grad2`, and I also added the following comment:
> Note that we can use `backprop=True` on both gradients `grad1` and `grad2`, but, based on our experiments, this doesn't make a substantial difference. Thus, to save computation, one can just use `backprop=True` on one of the two gradients.
When we were writing the paper, we did check whether using backprop on both gradients makes a difference, and it turned out not to matter much: we got nearly identical results, just with slightly slower training. So we decided to use `backprop=True` only for the second gradient, i.e. as shown in `train.py`. But I forgot this detail when I was writing the README, so thanks a lot for catching it!
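To make the distinction concrete, here is a minimal PyTorch sketch of the GradAlign regularizer with `backprop=True` only on `grad2`, as described above. The helper name `get_input_grad` and its arguments mirror the shape of the code in `train.py`, but this is an illustrative reimplementation, not the repository's exact code; `grad_align_lambda` and the tiny model in the usage note are assumptions.

```python
import torch
import torch.nn.functional as F

def get_input_grad(model, X, y, eps, delta_init='none', backprop=False):
    # Hypothetical minimal version of the helper in train.py: returns
    # d loss / d input, optionally keeping the graph for second-order backprop.
    if delta_init == 'random_uniform':
        delta = torch.empty_like(X).uniform_(-eps, eps)
    else:
        delta = torch.zeros_like(X)
    delta.requires_grad_(True)
    loss = F.cross_entropy(model(X + delta), y)
    # create_graph=backprop decides whether this gradient itself is
    # differentiable (i.e. whether it "records grad").
    grad = torch.autograd.grad(loss, delta, create_graph=backprop)[0]
    if not backprop:
        grad = grad.detach()  # no second-order graph for this gradient
    return grad

def grad_align_reg(model, X, y, eps, grad_align_lambda=0.2):
    # GradAlign: penalize misalignment (1 - cosine similarity) between the
    # input gradient at the clean point and at a random perturbation.
    # backprop=True only on grad2; using it on both gave nearly identical
    # results in the authors' experiments, just slower training.
    grad1 = get_input_grad(model, X, y, eps, delta_init='none', backprop=False)
    grad2 = get_input_grad(model, X, y, eps, delta_init='random_uniform', backprop=True)
    cos = F.cosine_similarity(grad1.flatten(1), grad2.flatten(1), dim=1)
    return grad_align_lambda * (1.0 - cos).mean()
```

The regularizer is differentiable because `grad2` was computed with `create_graph=True`, so calling `.backward()` on it propagates second-order terms into the model parameters, while the detached `grad1` acts as a fixed reference direction.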
Best, Maksym
Hi! This is nice work; however, one point confuses me. In your README, you say the GradAlign regularizer is the following:

where `grad1` and `grad2` both record gradients. However, in your `train.py` you use `detach`, which does not record gradients. Which one is correct? https://github.com/tml-epfl/understanding-fast-adv-training/blob/65fda9b02dc5e25b374a47b532cddd6e0829e4b0/train.py#L162