ZhaofanQiu / pseudo-3d-residual-networks

Pseudo-3D Convolutional Residual Networks for Video Representation Learning
MIT License

leave_one_out #6

Open 3DMM-ICME2023 opened 6 years ago

3DMM-ICME2023 commented 6 years ago

Hi @ZhaofanQiu, thanks for your nice work!

How do you implement the leave-one-video-out algorithm for the Dynamic Scene dataset? The dataset has 130 videos (13 classes, 10 videos each). Do you just train 130 models with libsvm?

ZhaofanQiu commented 6 years ago

Yes, we train 130 linear SVMs, leaving each video out in turn as the validation set.
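The loop described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes one precomputed feature vector per video (`X`) with class labels (`y`), and uses scikit-learn's `SVC` with a linear kernel in place of the libsvm command-line tools (scikit-learn wraps libsvm internally). The function name `leave_one_video_out` is hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def leave_one_video_out(X, y, C=100.0):
    """Leave-one-video-out evaluation: train one linear SVM per held-out video.

    X : (n_videos, n_features) array of per-video representations.
    y : (n_videos,) array of class labels.
    Returns the fraction of held-out videos classified correctly.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    n = len(y)
    correct = 0
    for i in range(n):
        # Train on all videos except video i.
        train_idx = np.setdiff1d(np.arange(n), [i])
        clf = SVC(kernel="linear", C=C)
        clf.fit(X[train_idx], y[train_idx])
        # Test on the single held-out video.
        correct += int(clf.predict(X[i:i + 1])[0] == y[i])
    return correct / n
```

For the Dynamic Scene dataset this would run 130 times (n = 130), matching the 130 SVMs mentioned above.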

3DMM-ICME2023 commented 6 years ago

Thanks for your kind help!

3DMM-ICME2023 commented 6 years ago

Sorry to disturb you again! I cannot reproduce the results reported in your paper. Could you please share more details about training the SVM models on the Dynamic Scene dataset? Were all the models trained with the liblinear tools and default parameters? If not, could you show me the parameters you used? Did all models use the same parameters? Thank you very much!

ZhaofanQiu commented 6 years ago

In this experiment, I used the libsvm toolbox with a linear kernel to train the SVMs. I think there are a few key points related to the performance:

  1. The video resolutions in the Dynamic Scene dataset are not fixed. When extracting the P3D representation, we resize the short edge of each video frame to 160 while keeping the aspect ratio, then crop the center 160×160 region.
  2. Mirror augmentation is added: for each clip, we average the two representations (no-mirror and mirror).
  3. Before SVM training, we apply signed square-root and then L2 normalization.
  4. We fix the hyper-parameter C to 100 for all SVMs.

Each of these steps may influence the final performance, but they are common settings; we do not change them for different tasks. Best, Zhaofan
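Steps 2 and 3 above can be sketched with NumPy alone. This is an illustrative sketch, not the authors' code: the function name `preprocess_feature` is hypothetical, and it assumes each clip yields two 1-D P3D feature vectors (original and horizontally flipped).

```python
import numpy as np


def preprocess_feature(feat, feat_mirror):
    """Prepare a clip representation for linear-SVM training.

    feat, feat_mirror : 1-D P3D representations of a clip and its
    horizontally mirrored version (hypothetical inputs).
    """
    # Step 2: average the no-mirror and mirror representations.
    x = (np.asarray(feat, dtype=np.float64)
         + np.asarray(feat_mirror, dtype=np.float64)) / 2.0
    # Step 3a: signed square-root (sqrt of magnitude, sign preserved).
    x = np.sign(x) * np.sqrt(np.abs(x))
    # Step 3b: L2 normalization.
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x
```

The resulting vectors would then be fed to libsvm with a linear kernel and C = 100 (step 4).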