VMZ: Model Zoo for Video Modeling
Apache License 2.0

How to finetune the optical model on my own dataset? #22

Closed wenjie710 closed 6 years ago

wenjie710 commented 6 years ago

Hi @dutran,

Thanks a lot for answering my last several questions. Now I am trying to fine-tune the pre-trained optical flow model on my own dataset, and I have two more questions:

1. How do I generate optical flow from raw RGB videos?
2. How do I fine-tune the optical flow model on the optical flow images generated from my dataset?

Could you please give me some advice or release the code for it?

murilovarges commented 6 years ago

Hi @wenjie710,

  1. You don't need to convert raw RGB videos to optical flow; the Caffe2 framework does this automatically.
  2. See this thread https://github.com/facebookresearch/R2Plus1D/issues/7 for details on how to fine-tune the OF model.
dutran commented 6 years ago

Thanks @murilovarges for your help. In fact, VideoInputOp (https://github.com/pytorch/pytorch/tree/master/caffe2/video) can compute optical flow on the fly. You can adjust the fine-tuning script (https://github.com/facebookresearch/R2Plus1D/blob/master/scripts/finetune_hmdb51.sh) with the extra parameters used in https://github.com/facebookresearch/R2Plus1D/blob/master/scripts/extract_feature_kinetics_of.sh. They are:

--clip_length_rgb=33 --sampling_rate_rgb=1 \
--clip_length_of=32 --sampling_rate_of=1 \
--flow_data_type=0 --frame_gap_of=1 --do_flow_aggregation=1 \
--num_channels=2 --input_type=1

Note that clip_length_rgb is always clip_length_of + 1, so if your model needs 8 OF frames as output, then clip_length_rgb = 9 and clip_length_of = 8.
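The clip-length relationship above is easy to get wrong when editing the fine-tuning script by hand. A minimal sketch in shell that derives the two flags from one setting; the variable names are illustrative, not from the original scripts:

```shell
#!/bin/sh
# Illustrative helper (not part of the VMZ/R2Plus1D scripts): derive
# clip_length_rgb from the desired number of optical-flow frames.
# VideoInputOp needs one extra RGB frame to compute each flow pair,
# so clip_length_rgb must equal clip_length_of + 1.

CLIP_LENGTH_OF=8                         # OF frames the model consumes
CLIP_LENGTH_RGB=$((CLIP_LENGTH_OF + 1))  # one extra RGB frame for flow

echo "--clip_length_rgb=${CLIP_LENGTH_RGB} --clip_length_of=${CLIP_LENGTH_OF}"
```

Pasting the echoed flags (together with the other OF parameters above) into finetune_hmdb51.sh keeps the two lengths consistent no matter how many flow frames the model expects.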