65.748 is not the single-view result; it is the average accuracy over all clips (at test time, each video has 5x3 views). In validation, we always take the middle clip of the video, so the accuracy (67.0) is higher.
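For reference, a minimal sketch of the difference between the two protocols (not the repo's code; `sample_view` is a hypothetical helper that extracts one temporal/spatial view):

```python
import torch

def multi_view_accuracy(model, video, label, num_temporal=5, num_spatial=3):
    # Average logits over 5 temporal clips x 3 spatial crops, then take the
    # argmax once; validation instead scores only the middle clip, which is
    # why the two numbers differ.
    logits = []
    for t in range(num_temporal):
        for s in range(num_spatial):
            clip = sample_view(video, t, s)  # hypothetical view-extraction helper
            logits.append(model(clip))
    avg_logits = torch.stack(logits).mean(dim=0)
    return (avg_logits.argmax(dim=-1) == label).float().mean()
```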
The inconsistent results when reloading the checkpoint are due to run-time parameters, such as running_mean & running_var in normalization layers. During training, the model replica on each GPU has its own run-time parameters, but we only save the checkpoint from GPU0. When the model is reloaded, the replicas on the other GPUs take their run-time parameters from GPU0's model, leading to slight differences in the results.
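If the model did contain BatchNorm, one standard workaround (a sketch under that assumption, not something VideoMAE necessarily does; `build_model` and `local_rank` are assumed to be defined elsewhere) is to synchronize the statistics across GPUs during training:

```python
import torch

# Convert BatchNorm layers to SyncBatchNorm before wrapping with DDP, so
# running_mean / running_var are computed over all GPUs and the GPU0
# checkpoint is representative of every replica.
model = build_model().cuda(local_rank)
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
```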
As far as I know, VideoMAE has no BatchNorm, so we needn't synchronize the running_mean & running_var. Are there any other parameters that need to be synchronized? And can we avoid this problem?
Another possibility is inconsistent batch sizes: when the test data cannot be evenly divided by the batch size, the last batch randomly selects some videos to fill it up.
I can't find that in your code. Can you show me the location of the corresponding implementation?
This code is modified from DeiT, and I haven't looked closely at how it's handled, but it should be related to the sampler.
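I can't confirm exactly how the DeiT-derived sampler implements it, but the general pattern looks like this sketch (`pad_to_full_batches` is illustrative, not the repo's function):

```python
import math

def pad_to_full_batches(indices, world_size, batch_size):
    # Repeat indices until the total length divides evenly by
    # world_size * batch_size, so the last batch on every GPU is full.
    # The duplicated samples get scored again, which can shift the
    # averaged metric slightly between runs.
    step = world_size * batch_size
    total = math.ceil(len(indices) / step) * step
    padded = list(indices)
    while len(padded) < total:
        padded.extend(indices[: total - len(padded)])
    return padded

# e.g. 10 test videos, 2 GPUs, batch size 4 -> padded to 16 indices
print(len(pad_to_full_batches(list(range(10)), world_size=2, batch_size=4)))  # 16
```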
I see, thank you!
I did some simple fine-tuning training, and the results looked normal:
But when I retested the saved .pt file with the '--eval' flag, I got slightly different results; in particular, the single-view test results were quite different (65.xx vs. 67.xx):
Is this normal or a bug? Is there something wrong with my understanding?