albertomontesg / keras-model-zoo

Keras Model Zoo

Dimensionality of c3d means #16

Closed by rohit-gupta 7 years ago

rohit-gupta commented 7 years ago

The input for the c3d model is of dimensions 3x16x112x112 (channels x timesteps x width x height), however the c3d_means file has a numpy array of size (3, 16, 128, 171).

Does this mean the video has to be resized? And what should be done when using C3D as a feature extractor on a different dataset?

rohit-gupta commented 7 years ago

Never mind, I found this in the paper:

All video frames are resized into 128×171. This is roughly half resolution of the UCF101 frames. Videos are split into non-overlapped 16-frame clips which are then used as input to the networks. The input dimensions are 3×16×128×171. We also use jittering by using random crops with a size of 3×16×112×112 of the input clips during training.
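
For anyone landing here with the same question, the following is a minimal sketch of how the two shapes could fit together: subtract the (3, 16, 128, 171) mean from a clip that has already been resized to 128×171, then crop to 112×112. The function name `preprocess_clip` is hypothetical, and using a center crop for feature extraction is an assumption (the quoted passage only describes random crops during training). The arrays below are random stand-ins, not the actual c3d_means data.

```python
import numpy as np


def preprocess_clip(clip, mean, crop_size=112):
    """Mean-subtract a clip and take a center crop (hypothetical helper).

    clip, mean: arrays of shape (channels, frames, height, width),
    e.g. (3, 16, 128, 171) as described in the paper excerpt.
    """
    if clip.shape != mean.shape:
        raise ValueError(
            "clip shape %s does not match mean shape %s" % (clip.shape, mean.shape)
        )
    # Subtract the mean at full 128x171 resolution, before cropping.
    centered = clip.astype(np.float32) - mean
    _, _, height, width = centered.shape
    # Assumption: center crop at extraction time instead of a random crop.
    top = (height - crop_size) // 2
    left = (width - crop_size) // 2
    return centered[:, :, top:top + crop_size, left:left + crop_size]


# Random stand-in data with the shapes discussed in this issue.
rng = np.random.default_rng(0)
mean = rng.random((3, 16, 128, 171), dtype=np.float32)
clip = rng.integers(0, 256, size=(3, 16, 128, 171)).astype(np.float32)

out = preprocess_clip(clip, mean)
print(out.shape)
```

The resulting array has the 3x16x112x112 shape the model expects. For a different dataset, the frames would first need to be resized to 128×171 so the mean array lines up; whether the UCF101-derived mean suits other data is a separate judgment call.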