StyleNet: Generating Attractive Visual Captions with Styles

* under development

StyleNet is a framework for generating attractive captions for images and videos in different styles. Its key component, the factored LSTM, automatically distills style factors from a monolingual text corpus.
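The idea behind the factored LSTM can be sketched in a few lines. The block below is a minimal illustration, not the code in this repository: it assumes the input-to-hidden weight of each gate is factored as U S V, that only S is style-specific, and that U, V and the hidden-to-hidden weights are shared across styles. It uses plain numpy so it runs without a deep learning framework. The class name `FactoredLSTMCell`, the `factor_size` parameter, the initialisation range and the style names are hypothetical choices made for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class FactoredLSTMCell:
    """Toy LSTM cell whose input weights are factored as U @ S[style] @ V.

    Assumption for this sketch: U, V, W_h and b are shared by all styles,
    and only the small square matrices S differ from style to style.
    """

    N_GATES = 4  # input gate, forget gate, output gate, candidate cell

    def __init__(self, input_size, hidden_size, factor_size, styles, seed=0):
        rng = np.random.RandomState(seed)

        def init(*shape):
            return rng.uniform(-0.1, 0.1, size=shape)

        n = self.N_GATES
        # Parameters shared across styles.
        self.U = init(n, hidden_size, factor_size)
        self.V = init(n, factor_size, input_size)
        self.W_h = init(n, hidden_size, hidden_size)
        self.b = np.zeros((n, hidden_size))
        # One set of style factors per style name.
        self.S = {name: init(n, factor_size, factor_size) for name in styles}

    def step(self, x, h, c, style):
        """Run one time step for input x, state (h, c) and the given style."""
        S = self.S[style]
        # Pre-activation of each gate: U S V x + W_h h + b.
        pre = [
            self.U[g] @ (S[g] @ (self.V[g] @ x)) + self.W_h[g] @ h + self.b[g]
            for g in range(self.N_GATES)
        ]
        i, f, o = sigmoid(pre[0]), sigmoid(pre[1]), sigmoid(pre[2])
        c_tilde = np.tanh(pre[3])
        c_next = f * c + i * c_tilde
        h_next = o * np.tanh(c_next)
        return h_next, c_next


if __name__ == "__main__":
    cell = FactoredLSTMCell(input_size=8, hidden_size=6, factor_size=4,
                            styles=["factual", "humorous", "romantic"])
    x, h, c = np.ones(8), np.zeros(6), np.zeros(6)
    h_factual, _ = cell.step(x, h, c, "factual")
    h_humorous, _ = cell.step(x, h, c, "humorous")
    # Same input and shared weights, but a different S gives a different state.
    print(np.array_equal(h_factual, h_humorous))
```

Switching the `style` argument swaps only the S matrices, so in this sketch a new style adds `4 * factor_size * factor_size` parameters while everything else is reused.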

[Figure: StyleNet framework]

[Figure: examples of generated captions]

Description

A PyTorch implementation of StyleNet
Authors: Chuang Gan, Zhe Gan, Xiaodong He, Jianfeng Gao, Li Deng
Published in: Computer Vision and Pattern Recognition (CVPR), 2017
URL: https://www.microsoft.com/en-us/research/wp-content/uploads/2017/06/Generating-Attractive-Visual-Captions-with-Styles.pdf
Dataset: https://zhegan27.github.io/Paper.html
Slideshare: https://www.slideshare.net/DeepLearningJP2016/dl-hacks-stylenet-generating-attractive-visual-captions-with-styles
written by Kota Kakiuchi

Requirements

python 3.5.3
pytorch 0.2.0
torchvision 0.1.9
numpy 1.13.3
scikit-image 0.13.1
nltk 3.2.5
