mzolfaghari / ECO-efficient-video-understanding

Code and models for the paper "ECO: Efficient Convolutional Network for Online Video Understanding", ECCV 2018
MIT License

About the training details #38

Closed by HuaZheLei 5 years ago

HuaZheLei commented 5 years ago

@mzolfaghari Thank you for your excellent work. Your paper mentioned that

We initialize the weights of the 2D-Net weights with the BN-Inception architecture [31] pre-trained on Kinetics, as provided by [33].

Could you explain how you trained the 2D-Net on Kinetics?

mzolfaghari commented 5 years ago

@HuaZheLei It's very similar to the 3D network, except that we apply score fusion at the final layer to get a single score for each video. For the details of our 2D network architecture, please have a look at the BN-Inception network definition.
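
For readers unfamiliar with score fusion, here is a minimal sketch of the idea. It assumes the fusion is a plain average of per-frame class scores, which the reply above does not confirm; the actual fusion in the released model may differ. The function names `fuse_frame_scores` and `predict_class` are hypothetical and do not come from the repository.

```python
from typing import List, Sequence


def fuse_frame_scores(frame_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame class scores into one video-level score vector.

    ASSUMPTION: fusion is a simple mean over frames. The thread only says
    "score fusion", so treat this as one plausible choice, not the
    authors' confirmed method.
    """
    if not frame_scores:
        raise ValueError("need at least one frame")
    num_classes = len(frame_scores[0])
    if any(len(row) != num_classes for row in frame_scores):
        raise ValueError("every frame must have the same number of class scores")
    num_frames = len(frame_scores)
    # For each class, sum its score over all frames and divide by the frame count.
    return [
        sum(row[c] for row in frame_scores) / num_frames
        for c in range(num_classes)
    ]


def predict_class(frame_scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused score."""
    video_scores = fuse_frame_scores(frame_scores)
    return max(range(len(video_scores)), key=video_scores.__getitem__)
```

For example, with three frames and two classes, `fuse_frame_scores([[1.0, 3.0], [3.0, 1.0], [2.0, 5.0]])` gives one score per class for the whole video, and `predict_class` picks the class with the larger fused score.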