This is the official implementation of AIFormer (ICCV 2023)
Understanding the emotions in text and presenting them visually is a very challenging problem that requires a deep understanding of natural language and high-quality image synthesis simultaneously. In this work, we propose Affective Image Filter (AIF), a novel model that is able to understand the visually-abstract emotions in text and reflect them in visually-concrete images with appropriate colors and textures. We build our model on the multi-modal transformer architecture, which unifies both images and texts into tokens and encodes the emotional prior knowledge. Various loss functions are proposed to understand complex emotions and produce appropriate visualizations. In addition, we collect and contribute a new dataset with abundant aesthetic images and emotional texts for training and evaluating the AIF model. We carefully design four quantitative metrics and conduct a user study to comprehensively evaluate the performance, which demonstrates that our AIF model outperforms state-of-the-art methods and can evoke specific emotional responses from human observers.
Clone this repo:
git clone https://github.com/zpx0922/AIFormer.git
Install PyTorch and dependencies
http://pytorch.org
Install other python requirements
pip install -r requirement.txt
Download the content (COCO2014) datasets.
Download the style (style image) datasets.
Download the description (Affective description) datasets.
Download the affective prior (VAD dictionary) datasets.
Pretrained models: vgg, embedding, decoder, Transformer, VAD_emb
For a quick look at the performance of the AIF model, run the test command below.
python test.py --content_dir content_pic --description_dir utterance.txt --output <Path_to_Output> --vgg <Path_to_VGG> --decoder <Path_to_decoder> --Trans <Path_to_transformer> --embedding <Path_to_embedding> --VAD_emb <Path_to_VAD_emb> --VAD_dic <Path_to_VAD_dictionary>
You can place the content images in the content_pic directory and modify the text description in utterance.txt.
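Because the test command takes many path arguments, it can help to assemble it in a small script. The following is a minimal sketch, not part of this repo: the helper name `build_test_command` and the weight file names in the demo are hypothetical, while the flag names are copied from the test command above.

```python
import shlex
import sys

# Flag names copied from the test command above; each one expects a file path.
TEST_FLAGS = ("vgg", "decoder", "Trans", "embedding", "VAD_emb", "VAD_dic")


def build_test_command(output, weights,
                       content_dir="content_pic",
                       description="utterance.txt"):
    """Return the argv list for test.py.

    `weights` maps every name in TEST_FLAGS to the path of the
    corresponding pretrained file (hypothetical helper, for illustration).
    """
    missing = [flag for flag in TEST_FLAGS if flag not in weights]
    if missing:
        raise ValueError("missing paths for: " + ", ".join(missing))

    cmd = [sys.executable, "test.py",
           "--content_dir", str(content_dir),
           "--description_dir", str(description),
           "--output", str(output)]
    for flag in TEST_FLAGS:
        cmd += ["--" + flag, str(weights[flag])]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace them with the downloaded pretrained models.
    demo = {flag: "weights/" + flag + ".pth" for flag in TEST_FLAGS}
    print(shlex.join(build_test_command("output", demo)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root once the placeholder paths point at real files.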
Pretraining of Sentiment Vector (SV) models and emotion classification models.
Pretrained models: Sentiment Vector, emotion classification
Use the following command for training:
python train.py --content_dir <Path_to_COCO2014> --style_dir <Path_to_WIKIART> --affective_ArtEmis <Path_to_Affective_description> --VAD_csv <Path_to_VAD_dictionary> --vgg <Path_to_VGG> --SV <Path_to_Sentiment_Vector> --label <Path_to_emotion_classification> --save_dir <Path_to_save_dir> --log_dir <Path_to_log_dir>
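Training depends on several downloaded datasets and pretrained files, so a quick pre-flight check can catch a wrong path before a long run starts. This is a sketch under assumptions: the helpers `missing_inputs` and `prepare_output_dirs` are hypothetical and not part of the repo, and whether train.py creates its own output directories is not stated here, so creating them up front is only a precaution. The flag names come from the training command above.

```python
from pathlib import Path

# Input flags taken from the training command above.
INPUT_FLAGS = ("content_dir", "style_dir", "affective_ArtEmis",
               "VAD_csv", "vgg", "SV", "label")


def missing_inputs(paths):
    """Return the input flags whose path is not given or does not exist on disk."""
    return [flag for flag in INPUT_FLAGS
            if flag not in paths or not Path(paths[flag]).exists()]


def prepare_output_dirs(save_dir, log_dir):
    """Create the checkpoint and log directories if they are not there yet."""
    for directory in (save_dir, log_dir):
        Path(directory).mkdir(parents=True, exist_ok=True)
```

A typical use is to call `missing_inputs` with a dictionary of the paths you plan to pass to train.py and stop if the returned list is not empty.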