Closed jaypatravali closed 2 years ago
The pre-trained model weights are available in 8- and 32-frame versions. I don't know if fine-tuning with 16 frames will work - give it a try and please post back your findings here, thanks! 🤗
On July 14, 2020 8:54:00 AM UTC, Jay Patravali notifications@github.com wrote:
Hi, I wish to train the 34-layer R2plus1D using IG65 pretrained weights and fine-tune it on the Kinetics-400 dataset. Is it okay to use 16-frame RGB inputs?
@jaypatravali it should work relatively well. Du had an ablation in his original R(2+1)D paper of models pre-trained on 8/32 and evaluated on 32 frames, and the accuracy drop is negligible. See https://arxiv.org/pdf/1711.11248.pdf, Table 3.
@daniel-j-h @bjuncek thanks for your prompt replies 😄. I'll try training on 16-frame RGB video inputs using the r2plus1d_34_32_IG65 pretrained weights. I'll let you know how it turns out.
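For anyone trying the same thing, here is a minimal sketch of one way to pick 16 frames per clip before feeding them to the model. It is only an illustration: the helper name `sample_frame_indices` and the evenly spaced sampling strategy are assumptions, not something taken from the repository or the paper, and other schemes (for example consecutive frames) may suit your data better.

```python
from typing import List


def sample_frame_indices(num_frames: int, clip_len: int = 16) -> List[int]:
    """Return clip_len evenly spaced frame indices for a video of num_frames frames.

    Hypothetical helper for illustration only; not part of any library.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    # Split the video into clip_len equal-width segments and take the frame
    # at the centre of each one. If the video has fewer frames than clip_len,
    # some indices repeat.
    return [
        min(num_frames - 1, int((i + 0.5) * num_frames / clip_len))
        for i in range(clip_len)
    ]


if __name__ == "__main__":
    # A 32-frame clip reduced to 16 frames (every second frame).
    print(sample_frame_indices(32, 16))
```

The returned indices can then be used to select frames from a decoded video before stacking them into the input tensor for the network.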
@jaypatravali Hi, can you tell me the result? How's it going?
@jaypatravali Hi, it doesn't work well in my case. How about you? Can you tell me the result?
Hi, I am new to training video models. I have been reading papers on action recognition that use newer models like R3D and R2plus1D with 16-frame inputs. Is there a way to use R2plus1D_34 with IG65 pretrained weights and fine-tune it on the Kinetics-400 dataset?