Open · sinkers-lan opened this issue 4 months ago
Hi @sinkers-lan, thank you for catching this! This is indeed a logical error. We'll fix it in the next version that's coming up.
Hi @sinkers-lan, does full fine-tuning mean training on new datasets from a different domain? I read your comments under another issue about preparing Kubric training datasets. Could you please give me some advice about full fine-tuning for CoTracker? If I want to apply CoTracker to my own datasets, is it necessary to train it on Kubric first, or can I just start "full fine-tuning"? I apologize for any inconvenience, as I'm still new to deep learning.
I am currently performing full fine-tuning. When I reduce the `traj_per_sample` parameter from 768 to 384 during training, the average loss approximately doubles. When I reduce `traj_per_sample` from 768 to 256, the average loss increases by about three times. After observing this, I carefully reviewed the code for the loss function and noticed that the loss is divided by N at the end:
This line of code can be found here.
Similarly, this line of code can be found here.
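To make the discussion concrete, here is my own simplified paraphrase of the pattern I am referring to (a minimal sketch; the function and variable names are illustrative, not the exact repository code):

```python
import torch

EPS = 1e-8

def reduce_masked_mean(x, mask, dim=None, keepdim=False):
    # Masked mean: sum(x * mask) / sum(mask). With dim=None it already
    # averages over every dimension, including the track dimension N.
    prod = x * mask
    if dim is None:
        return prod.sum() / (mask.sum() + EPS)
    return prod.sum(dim, keepdim=keepdim) / (mask.sum(dim, keepdim=keepdim) + EPS)

def track_loss_sketch(flow_pred, flow_gt, valid):
    # flow_pred, flow_gt: (B, S, N, 2); valid: (B, S, N), with N = traj_per_sample
    B, S, N, _ = flow_gt.shape
    per_point = (flow_pred - flow_gt).abs().mean(dim=3)  # (B, S, N)
    loss = reduce_masked_mean(per_point, valid)          # already a per-element mean
    return loss / N                                      # the extra division by N in question
```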
I believe this might be the cause of the increase in loss described above. Before the loss is divided by N, the `reduce_masked_mean` function has already computed the mean across the relevant dimensions, so dividing by N again makes the loss larger when N is smaller. I think this might be a logical error in the code. Your guidance on this issue would be greatly appreciated.