Video-MAC / VideoMAC

Official code for CVPR 2024 “VideoMAC: Video Masked Autoencoders Meet ConvNets”
https://arxiv.org/abs/2402.19082
MIT License

How to apply it to other downstream tasks: #2

Open · asher-bit opened this issue 4 months ago

asher-bit commented 4 months ago

Hello, thank you very much for your work. I would like to try applying this network to other downstream tasks. Do I need to retrain the network? Could you please provide the pre-trained network model? During inference, do I only need to use the target encoder and connect it to the corresponding task decoder? Thank you very much!

PGSmall commented 4 months ago

Please refer to videomac.

asher-bit commented 4 months ago

Thank you for your response. I apologize, as I am a beginner in this area. Is my understanding correct that masking is applied to the input during the pre-training phase to obtain a pre-trained model, and that during the inference phase this pre-trained model can be used directly to extract feature vectors?
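
To make sure I have the masking idea right, here is roughly what I picture the pre-training masking doing. This is only a minimal, illustrative sketch; the function name, patch size, and mask ratio are my own assumptions, not code from this repo:

```python
import torch

def random_patch_mask(frames, patch_size=32, mask_ratio=0.75):
    # frames: (B, C, H, W); H and W are assumed divisible by patch_size.
    b, c, h, w = frames.shape
    gh, gw = h // patch_size, w // patch_size
    num_patches = gh * gw
    num_masked = int(mask_ratio * num_patches)

    # Choose which patches to mask, independently for each sample.
    ids = torch.rand(b, num_patches).argsort(dim=1)[:, :num_masked]
    mask = torch.zeros(b, num_patches, dtype=torch.bool)
    mask.scatter_(1, ids, True)                    # True = masked patch

    # Zero out the masked patches in pixel space.
    pixel_mask = mask.view(b, 1, gh, 1, gw, 1).expand(b, c, gh, patch_size, gw, patch_size)
    masked_frames = frames.masked_fill(pixel_mask.reshape(b, c, h, w), 0.0)
    return masked_frames, mask
```

Is that the general idea during pre-training, with the encoder then trained to reconstruct the masked regions?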

PGSmall commented 4 months ago

That's right. At inference time you only need to extract features of the video frames with the VideoMAC model; downstream tasks such as VOS can then be performed with the label propagation method (CRW).
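
For intuition, the inference pipeline roughly follows the pattern below. This is only a simplified sketch, not the exact evaluation code in this repo: it assumes an `encoder` that maps frames to spatial feature maps, propagates from a single reference frame, and uses illustrative top-k and temperature values.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def extract_frame_features(encoder, frames):
    """Run each frame through the frozen encoder and L2-normalize the features.

    encoder: any backbone mapping (T, 3, H, W) -> (T, C, h, w); the pre-trained
             VideoMAC target encoder would be plugged in here.
    frames:  (T, 3, H, W) video clip.
    """
    feats = encoder(frames)                        # (T, C, h, w)
    return F.normalize(feats, dim=1)

@torch.no_grad()
def propagate_labels(feat_ref, labels_ref, feat_tgt, topk=10, temperature=0.07):
    """Propagate per-pixel labels from a reference frame to a target frame by
    soft attention over feature affinities (a CRW-style label propagation step,
    simplified to a single reference frame).

    feat_ref, feat_tgt: (C, h, w) normalized feature maps.
    labels_ref:         (K, h, w) one-hot (or soft) masks for K objects.
    """
    c, h, w = feat_ref.shape
    ref = feat_ref.flatten(1)                      # (C, h*w)
    tgt = feat_tgt.flatten(1)                      # (C, h*w)
    lab = labels_ref.flatten(1)                    # (K, h*w)

    affinity = tgt.t() @ ref                       # (h*w, h*w) cosine similarities
    vals, idx = affinity.topk(topk, dim=1)         # keep top-k reference pixels
    weights = F.softmax(vals / temperature, dim=1) # (h*w, topk)

    # Weighted vote over the reference labels at the top-k matched locations.
    gathered = lab[:, idx]                         # (K, h*w, topk)
    out = (gathered * weights.unsqueeze(0)).sum(-1)
    return out.view(-1, h, w)                      # (K, h, w) soft masks
```

The actual CRW-style evaluation typically propagates from several context frames and restricts matching to a local spatial window, but the core step is the same weighted vote shown above.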

asher-bit commented 4 months ago

I understand, thank you for your response!