techmn / satmae_pp

Official repository for "Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery" (CVPR 2024)
Apache License 2.0

Questions about reconstruction loss #3

Closed by misoyuri 3 months ago

misoyuri commented 3 months ago

Thank you for sharing your interesting paper.

I understand that in MAE the loss is computed only on the reconstructed image patches. However, when I checked the paper and the code in this repo, I noticed that the loss for the higher-scale images is computed over all image pixels, not just the reconstructed patch regions. Q1. Did you run any ablation studies on this?

Additionally, the paper mentions applying L1 loss to higher scales, similar to super-resolution, during multi-scale image reconstruction (just above Equation 4 of the paper). Q2. Could you share any references related to this part?

techmn commented 3 months ago

Thanks for pointing out the issue. Q1. You can rescale the mask and use it for the loss calculation. It should not affect the performance. For reference code, please see here: rescaling mask

Q2. Please see the ScaleMAE paper; here is the link: scalemae
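
For readers following the Q1 suggestion, here is a minimal sketch of what "rescale the mask and use it for loss calculation" could look like. It is not the repository's code: it uses NumPy instead of PyTorch for brevity, the helper names `upsample_patch_mask` and `masked_l1` are hypothetical, and it assumes the usual MAE convention that the mask has shape `(N, L)` with 1 marking a masked (reconstructed) patch. The patch grid, patch size, and scale factor below are illustrative values only.

```python
import numpy as np

def upsample_patch_mask(mask, grid_hw, pixels_per_patch):
    """Expand a per-patch mask (N, L) into a per-pixel mask (N, H, W).

    pixels_per_patch is the patch side length at the target scale,
    e.g. base_patch_size * scale_factor for a higher-scale output.
    """
    n = mask.shape[0]
    h, w = grid_hw
    grid = mask.reshape(n, h, w)
    # Nearest-neighbour upsampling: repeat every patch entry along both axes.
    return grid.repeat(pixels_per_patch, axis=1).repeat(pixels_per_patch, axis=2)

def masked_l1(pred, target, pixel_mask):
    """L1 loss averaged over masked pixels only.

    pred, target: (N, C, H, W); pixel_mask: (N, H, W) with 1 = masked.
    Assumes at least one pixel is masked.
    """
    err = np.abs(pred - target).mean(axis=1)  # average over channels -> (N, H, W)
    return (err * pixel_mask).sum() / pixel_mask.sum()

# Illustrative setup: 2x2 patch grid, base patch size 2, reconstruction at 2x scale,
# so each patch covers 4x4 pixels of the 8x8 higher-scale image.
mask = np.array([[1.0, 0.0, 0.0, 1.0]])          # patches 0 and 3 are masked
pixel_mask = upsample_patch_mask(mask, (2, 2), pixels_per_patch=2 * 2)

target = np.full((1, 3, 8, 8), 5.0)
target[:, :, :4, :4] = 2.0                        # region covered by patch 0
pred = np.zeros_like(target)

loss_masked = masked_l1(pred, target, pixel_mask)  # masked regions only
loss_all = np.abs(pred - target).mean()            # all pixels, for comparison
```

In this toy setup the two losses differ because the visible regions also contribute to `loss_all`; whether that difference matters for downstream performance is the ablation question raised in Q1.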