Closed misoyuri closed 3 months ago
Thanks for pointing out the issue. Q1. You can rescale the mask and use it for the loss calculation. It should not affect the performance. For reference code, please see here: rescaling mask
Q2. Please see the ScaleMAE paper; here is the link: scalemae
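To illustrate the Q1 suggestion, here is a minimal NumPy sketch of rescaling a patch-level mask to a higher-scale image and restricting an L1 loss to the masked pixels. It is not the code from this repo: the function names `rescale_patch_mask` and `masked_l1`, the mask convention (1 = masked/reconstructed patch), and the toy shapes are all assumptions made for illustration.

```python
import numpy as np

def rescale_patch_mask(mask, patch_size, scale):
    """Expand a (H_p, W_p) patch mask to pixel resolution at a given scale.

    Assumed convention: 1 marks a masked (reconstructed) patch, 0 a visible one.
    Each patch covers patch_size * scale pixels per side in the upscaled image.
    """
    factor = patch_size * scale
    # Nearest-neighbour upsampling: repeat every entry along both axes.
    return np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)

def masked_l1(pred, target, pixel_mask):
    """Mean absolute error over pixels where pixel_mask == 1.

    pred and target have shape (H, W, C); pixel_mask has shape (H, W).
    """
    per_pixel = np.abs(pred - target).mean(axis=-1)  # (H, W)
    denom = max(float(pixel_mask.sum()), 1.0)        # avoid division by zero
    return float((per_pixel * pixel_mask).sum() / denom)

# Toy example: a 2x2 patch grid, patch size 2, reconstruction at 2x scale.
mask = np.array([[1, 0],
                 [0, 1]])
pixel_mask = rescale_patch_mask(mask, patch_size=2, scale=2)  # shape (8, 8)

target = np.zeros((8, 8, 3))
pred = np.ones((8, 8, 3))
pred[:4, 4:] = 5.0  # large error inside a visible patch; the mask ignores it

loss = masked_l1(pred, target, pixel_mask)
```

With the rescaled mask, errors in visible patches do not contribute, so the higher-scale loss follows the same convention as the standard MAE loss on reconstructed patches.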
Thank you for sharing your interesting paper.
I understand that only the reconstructed image patches are used for loss calculation in MAE. However, when I checked the paper and the code in this repo, I noticed that the loss for higher-scale images uses all image pixels, not just the reconstructed image patch regions. Q1. Did you run any ablation studies on this?
Additionally, the paper mentions applying L1 loss to higher scales, similar to super-resolution, during multi-scale image reconstruction (just above Equation 4 of the paper). Q2. Could you share any references related to this part?