fine-tuning AVA dataset for spatiotemporal detection

OpenGVLab / VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

https://arxiv.org/abs/2303.16727

MIT License

527 stars 63 forks

fine-tuning AVA dataset for spatiotemporal detection #55

Open Young-eng opened 8 months ago

Young-eng commented 8 months ago

I wonder whether preparing a custom AVA-format dataset works the same way as in VideoMAE. Is the process of fine-tuning on a custom AVA-format dataset the same as in VideoMAE-action-detection? Thanks.