ColumbiaDVMM / CDC

CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos

Some Confusion on CDC Fine-tuning Steps #9

Closed 77QIQI closed 7 years ago

77QIQI commented 7 years ago

Hi Mr. Shou,

I have the following questions about the training process:

  1. Are the steps listed below correct? I followed them to fine-tune my own model but could not get the right results.
  2. How do you deal with short training videos that have fewer than 32 frames? gen_test_bin_and_list.py raises an error if a video has fewer than 32 frames. I am not sure whether simply ignoring these short videos will cause problems.

Thanks in advance.

==================================================================

Step 1: Prepare the pre-trained model

cd THUMOS14/training/init/
sh run_net_surgey_sports1m_convdeconv.sh

Step 2: Prepare your own training data

Extract frames from UCF101 (25 fps), then generate the bin files and the list file for the test set with gen_test_bin_and_list.py, as follows:

/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c01/000001.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c01/000033.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c01/000065.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c01/000097.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c01/000129.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c01/000134.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c02/000001.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c02/000033.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c02/000065.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c02/000093.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c03/000001.bin
/home/qiqi/cdc/THUMOS14/training/window/v_ApplyEyeMakeup_g01_c03/000033.bin
.....
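The start indices in the list above (1, 33, 65, ... followed by a final, closer index such as 134 or 93) look like non-overlapping 32-frame windows, with the last window shifted back so that it ends on the final frame. The sketch below reproduces that pattern under this assumption. It is not taken from gen_test_bin_and_list.py; the function names `window_starts` and `bin_name` are hypothetical, and the frame counts 165 and 124 are inferred from the list, not stated in the thread.

```python
def window_starts(num_frames, window=32):
    """Return 1-based start frames of consecutive non-overlapping windows.

    Assumption (inferred from the list in this thread, not from the CDC
    code): when the clip length is not a multiple of `window`, one extra
    window is added that ends exactly on the last frame.
    """
    if num_frames < window:
        return []  # clip is too short to hold even one window
    last_start = num_frames - window + 1
    starts = list(range(1, last_start + 1, window))
    if starts[-1] != last_start:
        starts.append(last_start)  # tail window, overlaps the previous one
    return starts


def bin_name(start):
    """Format a start frame as a zero-padded file name, e.g. 000033.bin."""
    return f"{start:06d}.bin"


if __name__ == "__main__":
    # Hypothetical clip lengths chosen to match the pattern in the list.
    for n in (165, 124, 20):
        print(n, [bin_name(s) for s in window_starts(n)])
```

Under this reading, a clip with fewer than 32 frames yields no windows at all, which is the case raised in question 2.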

Step 3: Train

Run sh finetuning.sh, which produces convdeconv-TH14_iter_24390 in the folder /snapshot.

minhtriet commented 7 years ago

I am also curious how one can change the window size. Some actions happen fast, especially in sports, so a 32-frame window is still too large.

zhengshou commented 7 years ago

Following conventional settings, I sample frames at around 25~30 FPS during frame extraction, so most video samples are longer than 1 s and thus contain at least one window. For the video samples that happen to be shorter than one window, I guess ignoring them won't affect the results significantly. If you are still concerned, you can probably pad them with blank frames.
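A minimal sketch of the padding idea, assuming frames are held in a Python list and that a "blank" frame is whatever placeholder the caller supplies (for example an all-zero image). The helper names `pad_to_window` and `window_duration_s` are hypothetical and not part of the CDC code. The second helper only shows the arithmetic behind the window length: 32 frames span 1.28 s at 25 FPS.

```python
def pad_to_window(frames, window=32, blank=None):
    """Return a copy of `frames` padded at the end with `blank` entries
    until it holds at least `window` frames. Longer clips are unchanged."""
    frames = list(frames)
    missing = window - len(frames)
    if missing > 0:
        frames.extend([blank] * missing)
    return frames


def window_duration_s(fps, window=32):
    """Duration in seconds covered by one window at the given frame rate."""
    return window / fps


if __name__ == "__main__":
    short_clip = [f"frame_{i:06d}" for i in range(1, 21)]  # 20 real frames
    padded = pad_to_window(short_clip, blank="BLANK")
    print(len(padded), padded[-1])
    print(window_duration_s(25), window_duration_s(30))
```

Whether padded windows help or hurt training is not settled in this thread; skipping the short clips remains the simpler option.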

buaa-luzhi commented 7 years ago

Hi @77QIQI, sorry to trouble you. How did you prepare your own training data? Do you have background or ambiguous classes? Thank you very much!

77QIQI commented 7 years ago

@buaa-luzhi Sorry for the late reply. I did not manage to train a model with correct results, so I am afraid I can't help you.