924973292 / TOP-ReID

【AAAI2024】TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation
MIT License
42 stars 2 forks

Hello, I would like to ask whether you know why the accuracy plummets when the code loads MAE pre-trained weights. #4

Closed 1125178969 closed 5 months ago

1125178969 commented 5 months ago

The loss at the start of training is lower than with ImageNet pre-trained weights, but the validation accuracy is much lower than with ImageNet pre-trained weights.

924973292 commented 5 months ago

Thank you for your interest! I have tried loading MAE pre-trained weights before and, as you observed, the results are worse than with ImageNet pre-trained weights. However, I am not sure how large the drop is in your case. As I recall, the results of the two pre-training methods on RGBNT201 differed by about 10 points. I speculate that this is because MAE pre-training mainly captures structural information, whereas supervised pre-training with real labels may help the model learn higher-level decision information. A similar phenomenon appears in many tasks.

It is also possible that your MAE pre-trained weights were not loaded correctly, or that the learning rate was not tuned for this initialization. Your observation is very detailed, by the way: I had not noticed that the initial loss with MAE was lower. I think you need to run several experiments to confirm that the initial loss really is lower, since it may be affected by random fluctuations.
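
One way to check whether the weights were loaded correctly is to compare parameter names and shapes between the backbone and the checkpoint before training. The following is only a minimal sketch, not code from this repository: `compare_state_dicts` is a hypothetical helper, and stripping a `module.` prefix is an assumption about how the checkpoint was saved.

```python
from typing import Dict, List, Mapping, Sequence, Tuple

Shape = Tuple[int, ...]


def compare_state_dicts(
    model_shapes: Mapping[str, Shape],
    ckpt_shapes: Mapping[str, Shape],
    strip_prefixes: Sequence[str] = ("module.",),  # assumed wrapper prefix
) -> Dict[str, List[str]]:
    """Report which model parameters a checkpoint would actually fill.

    Both arguments map parameter names to tensor shapes.
    """

    def normalize(name: str) -> str:
        # Drop the first matching wrapper prefix, if any.
        for prefix in strip_prefixes:
            if name.startswith(prefix):
                return name[len(prefix):]
        return name

    ckpt = {normalize(k): tuple(v) for k, v in ckpt_shapes.items()}
    model = {k: tuple(v) for k, v in model_shapes.items()}

    return {
        # Same name and same shape: these weights can be copied.
        "matched": sorted(k for k in model if k in ckpt and model[k] == ckpt[k]),
        # Same name but different shape (e.g. a position embedding
        # trained at another input resolution).
        "shape_mismatch": sorted(
            k for k in model if k in ckpt and model[k] != ckpt[k]
        ),
        # Model parameters that would keep their random initialization.
        "missing_in_ckpt": sorted(k for k in model if k not in ckpt),
        # Checkpoint entries the model has no use for (e.g. decoder weights).
        "unused_in_ckpt": sorted(k for k in ckpt if k not in model),
    }
```

With PyTorch, the shape mappings can be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`, and likewise from the dictionary returned by `torch.load(path, map_location="cpu")`. Some checkpoints nest the weights under a key such as `"model"`, so that may need to be unwrapped first. If `matched` is nearly empty, the backbone is effectively training from scratch.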

In addition, you can try the larger dataset RGBNT100 to see whether the same thing happens, since RGBNT201 is small and its results may fluctuate considerably. If the initial loss is indeed lower, my guess is that MAE's masking imitates occlusion, a common challenge in ReID, and therefore helps in the early learning stage. However, I do not yet have a clear explanation for the final results. I hope this helps!
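
To judge whether a difference in initial loss is more than random fluctuation, the same measurement can be repeated over several seeds for each initialization and the gap compared with the spread. This is a rough sketch only: the function names and the threshold `k` are hypothetical, and the rule is a simple heuristic rather than a formal statistical test.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize_runs(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) over repeated runs."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return mean(values), stdev(values)


def gap_exceeds_noise(
    runs_a: Sequence[float],
    runs_b: Sequence[float],
    k: float = 2.0,  # hypothetical threshold, in units of the larger std
) -> bool:
    """Heuristic: is the gap between the means larger than k times the spread?"""
    mean_a, std_a = summarize_runs(runs_a)
    mean_b, std_b = summarize_runs(runs_b)
    gap = abs(mean_a - mean_b)
    noise = max(std_a, std_b)
    return gap > k * noise
```

Here `runs_a` and `runs_b` would hold, for example, the first-epoch loss from each seed with MAE and with ImageNet initialization. The same check can be applied to the final validation accuracy.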

1125178969 commented 5 months ago

Thank you for your answer. It is very helpful, and your work has helped me understand multi-modal ReID.