bair-climate-initiative / scale-mae

Make your models invariant to changes in scale.
https://ai-climate.berkeley.edu/scale-mae-website/

About the fMoW dataset and difference with SatMAE #8

Closed bourcierj closed 8 months ago

bourcierj commented 9 months ago

Hello,

I was trying to reproduce your results on fMoW, and noticed some differences with prior work and SatMAE in particular. I haven't found details about these differences in the paper or the repo.

I am hoping that you could clarify why you opted for such different choices, and how this has impacted the comparison with SatMAE results.

Thank you.

RitwikGupta commented 8 months ago

Hey @bourcierj!

Apologies for the delayed response; I have been traveling internationally. Good questions. Here are the answers.

  1. To start with, please ignore the fMoW split files in the splits directory. We do not use those split files. I will be removing them from the repo.
  2. We use the entire fMoW train and test splits. In our code, we pass in the entire folder path and hence use the ImageFolder implementation, not ImageList. The spurious split files are causing the confusion. https://github.com/bair-climate-initiative/scale-mae/blob/main/mae/config/fmow.yaml#L5
  3. We are not mixing the RGB and MSRGB images. We use all of the RGB images and only the RGB images. Again, this confusion is caused by the spurious split files in the repo. https://github.com/bair-climate-initiative/scale-mae/blob/main/mae/dataloaders/fmow.py#L13
  4. The only pre-processing done by the fMoW baseline is the chipping of images and the conversion of metadata to feature vectors. Our code is equivalent to theirs in that we do not change the pixel values of the loaded images.
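
For readers trying to reproduce this setup, points 2 and 3 can be illustrated with a minimal sketch. This is not the repository's code. It assumes a folder-per-class layout under the dataset root (the ImageFolder convention) and assumes that RGB files end in `_rgb.jpg` while multispectral-derived files end in `_msrgb.jpg`. The function name `list_rgb_samples` and the `suffix` parameter are hypothetical, introduced only for illustration.

```python
import os


def list_rgb_samples(root, suffix="_rgb.jpg"):
    """Collect (path, class_index) pairs from a folder-per-class tree.

    Mimics the ImageFolder convention: each top-level directory under
    `root` is one class, and class indices follow sorted directory names.
    Only files whose names end with `suffix` are kept (assumed RGB naming).
    """
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}

    samples = []
    for name in classes:
        class_dir = os.path.join(root, name)
        # Walk nested instance folders; sort in place for a stable order.
        for dirpath, dirnames, filenames in os.walk(class_dir):
            dirnames.sort()
            for fname in sorted(filenames):
                if fname.lower().endswith(suffix):
                    path = os.path.join(dirpath, fname)
                    samples.append((path, class_to_idx[name]))
    return samples, class_to_idx
```

Calling `list_rgb_samples` on the full train folder would list every RGB image in it, without consulting any split file. Files ending in `_msrgb.jpg` are skipped because they do not match the suffix.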

Our comparison to SatMAE is therefore made on identical terms.

Thanks! Ritwik

RitwikGupta commented 8 months ago

https://github.com/bair-climate-initiative/scale-mae/commit/89280d830037ff27c20459cdab03e01e633e29bb

bourcierj commented 8 months ago

Thanks for the reply, @RitwikGupta! This explains why my run yielded results that are not comparable: using the spurious split files for train and val, I got much worse linear probe performance than both yours and SatMAE's. I will retry with the right splits.