JimLee1996 / AVSS2019

Efficient Violence Detection Using 3D Convolutional Neural Networks
MIT License

What is the difference between the lean and the original version of the 3D DenseNet #4

Open bilel-bj opened 4 years ago

bilel-bj commented 4 years ago

Thanks for your work. I read the whole paper, but I could not find the difference between the lean and the original version of the 3D DenseNet.

JimLee1996 commented 4 years ago

We cut off the last dense block, as it doesn't contribute to feature learning in this task and causes severe overfitting. Sorry for not explaining it in the paper.
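For readers who want to see what dropping the last dense block does to the network's size, here is a minimal sketch. It assumes a DenseNet-121 style configuration (block sizes `(6, 12, 24, 16)`, growth rate 32, 64 initial features, transitions that halve the channels); the thread does not say which configuration the repository actually uses, and `dense_channels` is a hypothetical helper, not a function from the repository.

```python
def dense_channels(block_config, growth_rate=32, num_init_features=64):
    """Channel count after the last dense block (assumed DenseNet bookkeeping)."""
    channels = num_init_features
    for i, num_layers in enumerate(block_config):
        # Each dense layer appends `growth_rate` feature maps.
        channels += num_layers * growth_rate
        # A transition layer after every block except the last halves the channels.
        if i != len(block_config) - 1:
            channels //= 2
    return channels


original = (6, 12, 24, 16)   # assumed DenseNet-121 style block sizes
lean = original[:-1]         # "lean": the last dense block is cut off

print("dense layers:", sum(original), "->", sum(lean))
print("classifier input channels:", dense_channels(original), "->", dense_channels(lean))
```

Under these assumptions the lean variant drops the 16 layers of the last block and the transition before it, while the classifier still receives features of the same width.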

JimLee1996 commented 4 years ago

Here is the reason:

The input to the 3D DenseNet is 3x16x112x112, whereas the ImageNet DenseNet takes 3x224x224. As a result, the feature map entering the 4th (i.e. the last) block degenerates to a scalar.

So we may not need the extra nonlinear module, which adds computation and may cause more overfitting (slight overfitting does exist for the lean model on the mix dataset).
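The shape argument above can be checked with a little arithmetic. The sketch below is only an illustration under assumed strides that the thread does not confirm: a stem convolution with stride (1, 2, 2) followed by a stride-2 max pool for the 3D network, a stride-2 convolution and stride-2 max pool for the 2D one, and a stride-2 average pool in each transition. `block_input_shapes` is a hypothetical helper.

```python
def block_input_shapes(input_shape, stem_strides, num_blocks=4):
    """Feature-map size (without channels) at the input of each dense block."""
    shape = tuple(input_shape)
    # Stem: padded strided conv / max pool, output size is ceil(n / stride).
    for strides in stem_strides:
        shape = tuple(-(-n // s) for n, s in zip(shape, strides))
    shapes = [shape]
    # Each transition: 2x average pool without padding, output size is floor(n / 2).
    for _ in range(num_blocks - 1):
        shape = tuple(n // 2 for n in shape)
        shapes.append(shape)
    return shapes


# 3D DenseNet: 16 frames of 112x112 (assumed stem strides).
shapes_3d = block_input_shapes((16, 112, 112), [(1, 2, 2), (2, 2, 2)])
# ImageNet DenseNet: 224x224 (assumed stem strides).
shapes_2d = block_input_shapes((224, 224), [(2, 2), (2, 2)])

for i, (s3, s2) in enumerate(zip(shapes_3d, shapes_2d), start=1):
    print(f"block {i}: 3D input {s3}, 2D input {s2}")
```

With these assumed strides, the 4th block of the 3D network sees a 1x3x3 map: the temporal axis has collapsed and only a handful of spatial positions remain, compared with 7x7 in the ImageNet model. The exact sizes depend on the repository's actual layer settings.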

bilel-bj commented 4 years ago

Thanks a lot for your explanation. So, it is better to use the lean version?

JimLee1996 commented 4 years ago

Yep.