yuanyao366 / PRP

Apache License 2.0

decoder structure of C3D, R3D, R21D #7

Open hw-liang opened 3 years ago

hw-liang commented 3 years ago

Based on your repo, it seems that you use the same decoder structure for all of the backbones (C3D, R3D, R(2+1)D). But in your paper, it seems you used different decoder structures based on the C3D-block, R3D-block and R21D-block.

We cannot reproduce the results reported in the paper based on the current code. Could you also provide your decoder implementations for R3D and R(2+1)D?

AKASH2907 commented 3 years ago

In the network architecture section of the paper, they mention that the decoder is the same:

"In video generation, four deconvolutional layers are stacked and followed by C3D blocks. To generate a video which is r times as slow as the input video, we set the 4-th deconvolutional layer with a stride of r × 2 × 2, where the reconstructing rate r is determined through ablation study."

I haven't tried to reproduce the exact results; I just wanted to clear up your doubt. Maybe the original authors can shed more light on how to reproduce them.
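
To make the stride setting in the quoted passage concrete, here is a rough sketch of the shape arithmetic for the 4th deconvolutional layer. It is not taken from the repo: it uses the standard transposed-convolution output-size formula (the one documented for PyTorch's `ConvTranspose3d`), and the kernel size (set equal to the stride), the zero padding, the input feature shape and the value of `r` are all assumptions chosen for illustration.

```python
def deconv_out_len(in_len, kernel, stride, padding=0, output_padding=0, dilation=1):
    """Output length of a transposed convolution along a single axis."""
    return (in_len - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1


def deconv3d_out_shape(in_shape, kernel, stride):
    """Apply the per-axis formula to a (T, H, W) feature shape."""
    return tuple(deconv_out_len(n, k, s) for n, k, s in zip(in_shape, kernel, stride))


r = 2                 # hypothetical reconstructing rate
feat = (16, 56, 56)   # hypothetical (T, H, W) entering the 4th deconv layer
stride = (r, 2, 2)    # stride of r x 2 x 2, as described in the quoted passage

# Assumption: kernel size equals the stride and padding is 0,
# so each axis is upsampled by exactly its stride.
out = deconv3d_out_shape(feat, kernel=stride, stride=stride)
print(out)  # (32, 112, 112)
```

Under these assumptions the temporal axis grows by a factor of `r` while height and width double, which matches the "r times as slow" description. The layer is described independently of the backbone, so the same arithmetic would apply whether the encoder is C3D, R3D or R(2+1)D.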

hw-liang commented 3 years ago

Oh, I missed this part. Thank you very much! They provided the pretrained checkpoint, and I tried to reproduce the UCF classification result with the C3D model using the original code and hyperparameter settings, but there is a 5% gap.

yuanyao366 commented 3 years ago

The decoder structure is the same for C3D, R3D, and R21D.