ControlNet / MARLIN

[CVPR] MARLIN: Masked Autoencoder for facial video Representation LearnINg
https://openaccess.thecvf.com/content/CVPR2023/html/Cai_MARLIN_Masked_Autoencoder_for_Facial_Video_Representation_LearnINg_CVPR_2023_paper

How to train VideoMAE models #10

Closed by ByeongjunCho 3 months ago

ByeongjunCho commented 11 months ago

Hi, thanks for sharing your paper and implementation code.

I am trying to train a VideoMAE model so that I can compare its performance with MARLIN.

I found the emotion recognition results for MARLIN and VideoMAE in Table 3 (Facial Expression and Sentiment Recognition) and Table 7 of your paper.

How did you train the VideoMAE model reported in the MARLIN paper? I would like to know the details of the training setup (preprocessing, model, etc.).

Sorry for my English.

Thank you

ControlNet commented 7 months ago

Hi, the main differences between MARLIN and VideoMAE are the adversarial loss and the masking strategy. Setting mask_strategy to "tube" and adv_weight to 0 should be enough to train VideoMAE.
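
For illustration, the two overrides described above could be applied roughly as follows. This is only a sketch: `mask_strategy` and `adv_weight` are the option names from the reply, but the dict layout, the helper `to_videomae_config`, and the starting values are assumptions made for this example, not the repository's actual config format.

```python
from copy import deepcopy


def to_videomae_config(marlin_cfg: dict) -> dict:
    """Return a copy of a MARLIN-style pretraining config with the two
    overrides from the reply applied (hypothetical helper, not part of MARLIN)."""
    cfg = deepcopy(marlin_cfg)       # leave the original config untouched
    cfg["mask_strategy"] = "tube"    # use tube masking, as in VideoMAE
    cfg["adv_weight"] = 0            # weight 0 switches off the adversarial loss term
    return cfg


# Starting values are placeholders for illustration only.
marlin_cfg = {"mask_strategy": "fasking", "adv_weight": 0.01}
videomae_cfg = to_videomae_config(marlin_cfg)
print(videomae_cfg)
```

If the pretraining config lives in a file, editing those two fields directly there should have the same effect; everything else (preprocessing, backbone, schedule) would stay as in the MARLIN setup.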