WingsBrokenAngel / delving-deeper-into-the-decoder-for-video-captioning

Source code for Delving Deeper into the Decoder for Video Captioning
MIT License

Inference on single video #5

Closed. nikky4D closed this issue 4 years ago.

nikky4D commented 4 years ago

Hi,

Do you have a demo.py/ipynb that I can use to run inference on a single video and see the generated captions? If not, can you describe how I can go about setting this up?

Thanks

WingsBrokenAngel commented 4 years ago

  1. Encoder part: use ResNeXt, ECO, and the Semantic Detection Network to extract features from the video clip.
  2. Decoder part: feed those features into the captioning model as inputs; it then generates the caption (a rough sketch of this flow follows the list).
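
Roughly, the flow looks like the sketch below. The feature extractors and the trained decoder are stubbed out with placeholders, and the function names and feature sizes are illustrative only, not the actual scripts or dimensions in this repo:

```python
# Hypothetical sketch of the encoder/decoder inference flow (not the repo's API).
import numpy as np

def extract_resnext_features(frames):
    # Placeholder for ResNeXt: one appearance vector per sampled frame,
    # mean-pooled into a single clip-level vector (2048-d is illustrative).
    per_frame = np.random.rand(len(frames), 2048)
    return per_frame.mean(axis=0)

def extract_eco_features(frames):
    # Placeholder for ECO: a clip-level motion vector (size illustrative).
    return np.random.rand(1536)

def detect_semantics(frames, num_tags=300):
    # Placeholder for the Semantic Detection Network: tag probabilities.
    return np.random.rand(num_tags)

def greedy_decode(features, vocab, max_len=20):
    # Placeholder for the trained decoder: in the real model the logits at
    # each step are conditioned on `features` and the words generated so far.
    caption = []
    for _ in range(max_len):
        logits = np.random.rand(len(vocab))  # would come from the decoder
        word = vocab[int(np.argmax(logits))]
        if word == "<eos>":
            break
        caption.append(word)
    return " ".join(caption)

if __name__ == "__main__":
    frames = ["frame_%04d.jpg" % i for i in range(16)]  # sampled frames of one video
    feats = np.concatenate([
        extract_resnext_features(frames),
        extract_eco_features(frames),
        detect_semantics(frames),
    ])
    vocab = ["a", "man", "is", "playing", "guitar", "<eos>"]
    print(greedy_decode(feats, vocab))
```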
amil-rp-work commented 4 years ago

@WingsBrokenAngel Can you please provide code/repo links on how to go about feature extraction for the encoder part?

WingsBrokenAngel commented 4 years ago

> @WingsBrokenAngel Can you please provide code/repo links on how to go about feature extraction for the encoder part?

ResNeXt can be found in tensornets, and ECO can be found in ECO-efficient-video-understanding.
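
For the ResNeXt side, a minimal sketch with tensornets under TensorFlow 1.x might look like the following. The model variant (nets.ResNeXt101), the choice of get_middles()[-1] as the feature endpoint, and the frame paths are assumptions on my part, so please check them against the tensornets README before relying on the exact numbers:

```python
# Sketch only: per-frame appearance features with tensornets (TF 1.x).
import numpy as np
import tensorflow as tf
import tensornets as nets

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
model = nets.ResNeXt101(inputs)    # ends in a 1000-way classification head
middles = model.get_middles()      # intermediate endpoints; assumption: the
                                   # last one is the final conv block

frame_paths = ["frame_0000.jpg", "frame_0008.jpg", "frame_0016.jpg"]  # sampled frames

with tf.Session() as sess:
    sess.run(model.pretrained())   # load ImageNet-pretrained weights
    feats = []
    for path in frame_paths:
        img = nets.utils.load_img(path, target_size=256, crop_size=224)
        img = model.preprocess(img)
        out = sess.run(middles[-1], {inputs: img})
        feats.append(out.mean(axis=(1, 2)).squeeze())  # global average pool
    clip_feature = np.mean(feats, axis=0)  # one appearance vector per clip
    print(clip_feature.shape)
```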