microsoft / UniVL

An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
https://arxiv.org/abs/2002.06353
MIT License

end-to-end video file captioning process #36

Closed: mhyeonsoo closed this issue 2 years ago

mhyeonsoo commented 2 years ago

Hi, thanks for the great resources. I successfully fine-tuned the video captioning model on YouCook2, and I am now trying to pass a video file as input and get its caption. But from what I see in the code, it always seems to require a pickle file or an .npy feature file. Is there a way to get the caption as output for a custom video file input? Thanks,

ArrowLuo commented 2 years ago

Hi @mhyeonsoo, I think you could modify the data loader to fit your requirements, using dataloader_youcook_caption.py as a reference. Best~
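For readers following this suggestion, here is a minimal sketch of one piece such a modified loader would likely need: turning the pre-extracted features of a single custom video into a fixed-length array plus a mask. It is not code from the repository. The helper name `pad_video_features`, the `max_frames=48` default, and the 1024-dimensional features are assumptions for illustration, and the random array stands in for features you would extract from your own video and load with `np.load`.

```python
import numpy as np


def pad_video_features(features, max_frames=48):
    """Pad or truncate per-frame features to a fixed length and build a mask.

    Hypothetical helper: `features` is assumed to have shape
    (num_frames, feature_dim). Returns (video, video_mask) with shapes
    (max_frames, feature_dim) and (max_frames,).
    """
    features = np.asarray(features, dtype=np.float32)
    if features.ndim != 2:
        raise ValueError("expected a 2-D array of shape (num_frames, feature_dim)")

    # Keep at most max_frames rows; shorter inputs are zero-padded at the end.
    num_valid = min(features.shape[0], max_frames)
    video = np.zeros((max_frames, features.shape[1]), dtype=np.float32)
    video[:num_valid] = features[:num_valid]

    # Mask is 1 for real frames and 0 for padding.
    video_mask = np.zeros(max_frames, dtype=np.int64)
    video_mask[:num_valid] = 1
    return video, video_mask


# Illustrative usage: random values stand in for features loaded from an .npy file.
dummy = np.random.rand(10, 1024).astype(np.float32)
video, video_mask = pad_video_features(dummy, max_frames=48)
```

The actual frame limit, feature dimension, and the extra tensors the captioning model expects should be taken from dataloader_youcook_caption.py and the fine-tuning arguments rather than from this sketch.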

mhyeonsoo commented 2 years ago

Thanks, I will look into that :)

cws7777 commented 1 year ago

Hi @mhyeonsoo, have you figured this out? If you have, could you please share it?

Thanks!