Is fusion_encoder used for video captioning?

X-PLUG / mPLUG-2

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)

Apache License 2.0


Open Astuary opened 1 month ago

Astuary commented 1 month ago

Hello!

I was looking at model_video_caption_mplug.py and noticed that self.fusion_encoder is never called in the forward pass.

Does it still need to be instantiated for the video captioning task?
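
One way to confirm this kind of observation is to register forward hooks on each submodule and see which ones fire during a forward pass. The sketch below does that on a toy module. `ToyCaptioner`, its layer sizes, and the helper `called_submodules` are hypothetical stand-ins, not the actual mPLUG-2 code. The same helper should work on the real model if it is given a valid input batch.

```python
import torch
import torch.nn as nn


class ToyCaptioner(nn.Module):
    """Hypothetical stand-in for illustration; NOT the mPLUG-2 model."""

    def __init__(self):
        super().__init__()
        self.visual_encoder = nn.Linear(8, 8)
        self.fusion_encoder = nn.Linear(8, 8)  # instantiated but never called
        self.text_decoder = nn.Linear(8, 4)

    def forward(self, x):
        # fusion_encoder is deliberately skipped here.
        return self.text_decoder(self.visual_encoder(x))


def called_submodules(model, *inputs):
    """Return the names of direct child modules invoked in one forward pass."""
    called = set()
    handles = []
    for name, module in model.named_children():
        # Bind `name` as a default argument so each hook records its own module.
        hook = lambda mod, inp, out, name=name: called.add(name)
        handles.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return called


model = ToyCaptioner()
used = called_submodules(model, torch.randn(2, 8))
all_children = {name for name, _ in model.named_children()}
unused = sorted(all_children - used)
print("unused:", unused)  # prints: unused: ['fusion_encoder']
```

A submodule that never fires can still matter in practice: its weights appear in the model's `state_dict`, so a checkpoint saved with it may fail to load under `load_state_dict(..., strict=True)` if the attribute is removed. Whether that applies to the released mPLUG-2 checkpoints is not something this sketch can answer.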