mynlp / cst_captioning

PyTorch Implementation of Consensus-based Sequence Training for Video Captioning

feature fusion? #12

Open upccpu opened 6 years ago

upccpu commented 6 years ago

Hello, I found that each video has only one feature vector (C3D, ResNet) rather than the features of all the frames we chose. Could you tell me how to combine them?

plsang commented 6 years ago

Hello. We used average pooling to aggregate frame-level features into video-level features.
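
As a rough illustration of the idea (a minimal NumPy sketch, not the repository's actual code), average pooling takes the mean over the frame axis, so a `(num_frames, feature_dim)` array becomes a single `(feature_dim,)` vector. The function name `average_pool_frames` and the toy array below are hypothetical, chosen only for this example.

```python
import numpy as np


def average_pool_frames(frame_features):
    """Collapse a (num_frames, feature_dim) array into one (feature_dim,) vector.

    Hypothetical helper for illustration; not taken from the repository.
    """
    frame_features = np.asarray(frame_features, dtype=np.float32)
    if frame_features.ndim != 2 or frame_features.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, feature_dim) array")
    # Mean over axis 0 averages each feature dimension across all frames.
    return frame_features.mean(axis=0)


# Toy input: 3 sampled frames with 4-dimensional features each (made-up values).
frames = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0, 8.0],
])
video_feature = average_pool_frames(frames)
print(video_feature.shape)  # (4,)
```

The same call would apply separately to each feature type (for example, once to the per-frame ResNet features and once to the per-clip C3D features), giving one vector per type per video.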