Closed: hussein-jafarinia closed this issue 2 months ago
I'm curious, too. Going to test it.
Found this:

> Here we use `--norm_pix_loss` as the target for better representation learning. To train a baseline model (e.g., for visualization), use pixel-based construction and turn off `--norm_pix_loss`.
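For reference, the normalized-pixel target that `--norm_pix_loss` enables can be sketched in a few lines. This is a minimal plain-Python sketch, not the repo's code: the reference implementation (`models_mae.py`) does this on tensors, normalizing each target patch with its own mean and variance before computing the reconstruction loss, and its variance convention (biased vs. unbiased) may differ from `statistics.pvariance` used here.

```python
import statistics

def normalize_patch(pixels, eps=1e-6):
    """Per-patch normalized target, as used when --norm_pix_loss is on.

    Sketch only: each patch is shifted to zero mean and scaled by its
    own standard deviation, so the loss compares normalized pixels.
    """
    mean = statistics.fmean(pixels)
    var = statistics.pvariance(pixels, mean)
    return [(p - mean) / (var + eps) ** 0.5 for p in pixels]

# One toy 4-pixel "patch": with --norm_pix_loss the loss is computed
# against these normalized values instead of the raw pixels.
patch = [0.1, 0.4, 0.6, 0.9]
print(normalize_patch(patch))
```

With `--norm_pix_loss` off, the target is simply the raw pixels, which is why that setting reconstructs images you can actually look at.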
I can only get checkpoints that include the encoder only (mae_pretrain_vit_xx.pth). How can I get a checkpoint that includes both the encoder and decoder (mae_visualize_vit_xx.pth) for my own model?
> I can only get checkpoints that include the encoder only (mae_pretrain_vit_xx.pth). How can I get a checkpoint that includes both the encoder and decoder (mae_visualize_vit_xx.pth) for my own model?
You can find the answer in other issues. As far as I remember, you should append "_full" to the checkpoint filename, just before ".pth".
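Following the suggestion above, the rename can be sketched like this. Note the resulting `_full` URL is an assumption taken from that answer, not something verified here, so the (commented-out) download may 404:

```shell
# Start from a known encoder-only checkpoint URL.
url="https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth"

# Insert "_full" before the ".pth" suffix, per the answer above.
full_url="${url%.pth}_full.pth"
echo "$full_url"

# Hypothetical download; whether this file actually exists on the
# server is an assumption from the linked issues:
# curl -fO "$full_url"
```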
This question is answered completely in other issues, so I am closing it.
There are two different checkpoints for each ViT in your source code. One group includes only the encoder (e.g. https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth), and the other group includes both the encoder and the decoder; links to these appear in the demo code (e.g. https://dl.fbaipublicfiles.com/mae/visualize/mae_visualize_vit_large.pth). Why aren't the weights in them the same? What is the difference between the two groups (for example, in training parameters)? Which group is better?
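One way to see the structural difference yourself is to compare the state-dict keys of the two checkpoints: the pretrain file should lack the decoder entries that the visualize file carries. A minimal sketch follows; the key prefixes (`decoder_*`, `mask_token`) follow the reference `models_mae.py`, and the toy dicts stand in for what `torch.load(path, map_location="cpu")["model"]` would return, so verify the names against your own model:

```python
# Sketch: decide whether a checkpoint holds decoder weights by key prefix.
# Prefixes assumed from the reference MAE implementation (models_mae.py).

def has_decoder(state_dict):
    """True if any key belongs to the decoder side of the model."""
    return any(k.startswith(("decoder_", "mask_token")) for k in state_dict)

# Toy stand-ins for the real state dicts loaded from the .pth files:
encoder_only = {
    "cls_token": 0,
    "patch_embed.proj.weight": 0,
    "blocks.0.attn.qkv.weight": 0,
}
full_ckpt = dict(encoder_only, **{
    "mask_token": 0,
    "decoder_embed.weight": 0,
    "decoder_blocks.0.attn.qkv.weight": 0,
})

print(has_decoder(encoder_only))  # no decoder_* keys present
print(has_decoder(full_ckpt))     # decoder keys present
```

Printing `sorted(state_dict)` on the real files would show exactly which weights differ between the two releases.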