DequanWang / tent

ICLR21 Tent: Fully Test-Time Adaptation by Entropy Minimization
https://arxiv.org/abs/2006.10726
MIT License

The Number of Forward Pass #3

Closed · kinredon closed this issue 3 years ago

kinredon commented 3 years ago

As the paper says, Tent needs 2× the inference time plus 1× the gradient time per test point, but I found only one forward pass and one gradient update in the code: https://github.com/DequanWang/tent/blob/03ac55c4fef0fda3eacb2f6414b099031e96d003/tent.py#L49

Is the right order: forward, backward, and then another forward?
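For reference, the linked step boils down to roughly the following (a paraphrased sketch, not a verbatim copy of `tent.py`):

```python
import torch

def softmax_entropy(x: torch.Tensor) -> torch.Tensor:
    """Entropy of the softmax distribution, computed from logits."""
    return -(x.softmax(1) * x.log_softmax(1)).sum(1)

@torch.enable_grad()
def forward_and_adapt(x, model, optimizer):
    """One forward pass, then one entropy-minimization update."""
    outputs = model(x)                       # the only forward pass
    loss = softmax_entropy(outputs).mean(0)  # mean prediction entropy
    loss.backward()                          # the only backward pass
    optimizer.step()
    optimizer.zero_grad()
    return outputs                           # logits from before this update
```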

AI678 commented 3 years ago

It seems that the performance is insensitive to this extra forward pass.

kinredon commented 3 years ago

In my view, the BN layer parameters should be updated first by entropy minimization, and then predictions made with the updated parameters should achieve better performance. However, this implementation uses the output of the first forward pass, which confuses me.

shelhamer commented 3 years ago

Please see the published edition of the paper at ICLR'21, where we have updated the method regarding the number of forward passes:

[screenshot of the updated passage from the ICLR'21 paper on the number of forward passes]

In further experiments we found that the results are insensitive to re-forwarding after the update. In practice, tent often only requires a few updates to adapt to the shifts in our experiments, and so repeating inference is not necessary. The update on the last batch still improves prediction on the next batch. Note that this shows the adaptation learned by tent generalizes across target points, as it makes the prediction before taking the gradient, and so its improvement is not specific to each test batch (see this review comment for more discussion).
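Concretely, the online protocol looks like the following sketch (`tented_model` and `test_loader` are illustrative names): each batch is predicted with parameters adapted on all earlier batches, and only then contributes its own update.

```python
n_correct = 0
# illustrative online loop: the prediction for batch t uses parameters
# adapted on batches 1..t-1; batch t's update only benefits later batches
for x, y in test_loader:
    outputs = tented_model(x)  # predicts first, then adapts internally
    n_correct += (outputs.argmax(1) == y).sum().item()
```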

> Is the right order: forward, backward, and then another forward?

If you want to include the final forward pass, so that predictions are most up-to-date with respect to entropy minimization, you can simply add `outputs = self.model(x)` after the forward-and-adapt loop: https://github.com/DequanWang/tent/blob/master/tent.py#L30-L31
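In context, that change would look roughly like this (a sketch of the `Tent.forward` method with the extra line; `self.steps`, `self.model`, and `self.optimizer` as configured at construction):

```python
def forward(self, x):
    # adapt: run one or more forward + entropy-minimization updates
    for _ in range(self.steps):
        outputs = forward_and_adapt(x, self.model, self.optimizer)
    # extra forward pass: re-predict with the freshly updated parameters
    outputs = self.model(x)
    return outputs
```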

Thank you for your question about adaptation with and without repeating inference!

Jo-wang commented 2 years ago

Thank you! This helps me a lot.