google-deepmind / tapnet

Tracking Any Point (TAP)
https://deepmind-tapir.github.io/blogpost.html
Apache License 2.0

TAPIR loss computation clarification #100

Closed: shivanimall closed this issue 3 weeks ago

shivanimall commented 3 months ago

Hello,

I saw a comment in the tapir_model.py code that says: "At training time there's only one iteration, so we'll just get the final refined output." However, section 3.2 of the paper says: "At each iteration, we use the same loss L on the output, weighting each iteration the same as the initialization." I wanted to clarify whether or not the loss is computed once per refinement iteration (i.e., 5 times: num_pips_iters = 4 refinements + 1 initialization)?

Thank you!

shivanimall commented 3 months ago

Also, I understand that TAPIR is a test-time adaptation / refinement method, so is the loss referred to in Sec. 3.2 (noted above) applied at test time?

cdoersch commented 3 months ago

At training time there's only one iteration, so we'll just get the final refined output.

This appears to be a typo. There's only one resolution (i.e. training is done at 256x256), but it still uses 4 refinement iterations at training time. Therefore the loss is computed 5 times in training.
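
In pseudocode, that training loss schedule looks roughly like the sketch below (assumed names and shapes, not the actual tapnet code; only the position term is shown, whereas the full loss in the paper also has occlusion and uncertainty terms):

```python
import jax.numpy as jnp

def huber(pred, target, delta=1.0):
  # Standard Huber loss on the predicted track coordinates.
  err = jnp.abs(pred - target)
  quad = jnp.minimum(err, delta)
  return 0.5 * quad ** 2 + delta * (err - quad)

def training_point_loss(per_iter_tracks, gt_tracks):
  # per_iter_tracks: list of [num_points, num_frames, 2] arrays, one for the
  # initialization plus one per refinement iteration (5 entries in total).
  # Each entry is weighted the same, as described in Sec. 3.2 of the paper.
  total = 0.0
  for tracks in per_iter_tracks:
    total = total + jnp.mean(huber(tracks, gt_tracks))
  return total
```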

The loss is not applied at test time: it requires ground truth, and there is no ground truth at test time.

shivanimall commented 3 months ago

thank you for clarifying.

I also wanted to clarify whether the loss is applied to all of the refinement iterations together. It seems that the refined values from all iterations are returned jointly from the forward call, and the loss is then applied to that joint output rather than separately per refinement iteration.

Please let me know if I am unclear or missed something, thanks.

cdoersch commented 2 months ago

I don't understand the distinction. The model returns the predictions for every layer, and the loss is applied to all of them.
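
Applying one loss to the jointly returned predictions is equivalent to computing the loss separately per iteration and summing. A minimal sketch of that equivalence (array shapes and names are assumptions, not the real tapnet API; a simple squared error stands in for the actual loss terms):

```python
import jax.numpy as jnp

def loss_over_all_iterations(stacked_tracks, gt_tracks):
  # stacked_tracks: [num_iters, num_points, num_frames, 2], i.e. the
  # predictions from every iteration returned jointly by the forward call.
  # gt_tracks: [num_points, num_frames, 2].
  per_iter = jnp.mean((stacked_tracks - gt_tracks[None]) ** 2, axis=(1, 2, 3))
  # Summing over the iteration axis gives exactly the sum of the losses you
  # would get by computing them separately per refinement iteration.
  return jnp.sum(per_iter)
```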