google-deepmind / kinetics-i3d

Convolutional neural network model for video classification trained on the Kinetics dataset.
Apache License 2.0
1.74k stars, 461 forks

extract features #66

Open 123liluky opened 5 years ago

123liluky commented 5 years ago

If n_frames is the number of frames in a video, how can I extract features of size n_frames*1024 with rgb_videos as the input?

joaoluiscarreira commented 5 years ago

The features are computed with a temporal stride, and the deeper the layer, the larger that stride, so a deep layer outputs fewer feature vectors than there are input frames. One option is to combine features from different layers by upsampling the deeper ones (as with skip connections). Another is to interpolate the features between frames.
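
A minimal sketch of the interpolation option, assuming the features have already been extracted as an array of shape (t, 1024) with t smaller than n_frames. The helper name `interpolate_features` and the even spacing of the feature positions over the clip are assumptions made for illustration; they are not part of the kinetics-i3d code, and the random array only stands in for real features.

```python
import numpy as np


def interpolate_features(features, n_frames):
    """Linearly interpolate (t, c) temporal features to (n_frames, c).

    Assumes the t feature vectors are evenly spaced over the clip, with the
    first and last aligned to the first and last frame (a simplification).
    """
    features = np.asarray(features, dtype=np.float32)
    t, c = features.shape
    if t == 1:
        # A single feature vector can only be repeated for every frame.
        return np.repeat(features, n_frames, axis=0)
    # Assumed frame position of each strided feature vector.
    src = np.linspace(0.0, n_frames - 1, num=t)
    # One target position per input frame.
    dst = np.arange(n_frames)
    # np.interp is 1-D, so interpolate each channel separately.
    out = np.stack(
        [np.interp(dst, src, features[:, j]) for j in range(c)], axis=1
    )
    return out.astype(np.float32)


# Stand-in data: 8 temporal steps of 1024-d features for a 64-frame clip.
feats = np.random.rand(8, 1024).astype(np.float32)
per_frame = interpolate_features(feats, n_frames=64)
print(per_frame.shape)  # (64, 1024)
```

Linear interpolation only spreads the existing feature vectors over the frame axis; it does not recover per-frame detail that the strided layers discarded, which is why combining with shallower layers is the other option mentioned.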

Joao

On Wed, Jun 26, 2019, 8:10 AM 123liluky notifications@github.com wrote:

If n_frames represents the frames of a video, How can I extract features of size n_frames*1024 with rgb_videos as the input?
