Kamino666 / Video-Captioning-Transformer

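This is a deep learning model for video captioning, implemented in PyTorch with the Transformer architecture. The video captioning task takes a video as input and outputs one sentence describing the content of the whole video (provided the video is short and can be described in a single sentence). The main goal of this repo is to help visually impaired people enjoy online videos and perceive their surroundings, and to promote the development of "accessible video".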
Apache License 2.0
79 stars, 18 forks

How can I get the pre-trained model? #14

Closed: zydjohnHotmail closed this issue 2 years ago

zydjohnHotmail commented 2 years ago

Hello: Please remember that because of Internet blocking in China, people outside China cannot use anything published on Baidu. Please consider publishing them on Google. Thanks.

Kamino666 commented 2 years ago

Hello! I believe I did upload the model to Google Drive. Please check the table.

zydjohnHotmail commented 2 years ago

Hello: Thanks, I got the model. In how many languages can this model produce video captions? English and Chinese?

Kamino666 commented 2 years ago

Only English for now, because neither of the two datasets supports multiple languages. I am planning to use another dataset (VATEX), but I am having difficulty downloading the YouTube videos.

zydjohnHotmail commented 2 years ago

Hello: You can try this URL to download the dataset, though it may not work if you are in China, and I am not sure it is what you need: https://www.kaggle.com/datasets/neverix/vatex-lowres/download

Kamino666 commented 2 years ago

Thank you for the link! It is what I am looking for. I didn't know it was available on Kaggle. I will try it later.

zydjohnHotmail commented 2 years ago

You are welcome. If you can produce Chinese subtitles, let me know!