Installing PyTorch and the Python packages with Anaconda is recommended.
MSR-VTT. The test videos don't have captions, so I split train-video into train/val/test. Extract the files and put them in the ./data/ directory.
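A minimal sketch of how a captioned video list could be cut into train/val/test. The function name `split_video_ids`, the 10%/10% fractions, and the seed are assumptions for illustration only; they are not the split this repository actually uses.

```python
import random


def split_video_ids(video_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle ids deterministically and cut them into train/val/test lists.

    The fractions and seed are illustrative assumptions, not the repo's values.
    """
    ids = sorted(video_ids)              # fixed starting order
    random.Random(seed).shuffle(ids)     # reproducible shuffle
    n_val = int(len(ids) * val_frac)
    n_test = int(len(ids) * test_frac)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }


# Hypothetical ids; real MSR-VTT ids come from the annotation file.
splits = split_video_ids([f"video{i}" for i in range(100)])
print({name: len(ids) for name, ids in splits.items()})
```

Fixing the seed keeps the split stable across runs, so features and vocabulary built earlier stay aligned with the same videos.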
All default options are defined in opt.py or the corresponding code file; change them as you like.
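For orientation, here is a rough sketch of how an argparse-based options file of this kind typically looks. The flag names are taken from the commands in this README, but the default values and the `parse_opt` function shown here are assumptions; check opt.py for the real definitions.

```python
import argparse


def parse_opt(argv=None):
    """Build the option dict; defaults below are placeholders, not the repo's."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, default="S2VTAttModel")
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--epochs", type=int, default=3001)
    parser.add_argument("--checkpoint_path", type=str, default="data/save")
    # argv=None makes argparse read sys.argv; a list is handy for testing.
    return vars(parser.parse_args(argv))


# Any flag given on the command line overrides its default.
opt = parse_opt(["--batch_size", "300"])
print(opt["batch_size"], opt["model"])
```

Editing a default in the file changes it for every run, while passing the flag changes it for one run only.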
Some code refers to ImageCaptioning.pytorch
You can use video-classification-3d-cnn-pytorch to extract features from video.
python prepro_feats.py --output_dir data/feats/resnet152 --model resnet152 --n_frame_steps 40 --gpu 4,5
python prepro_vocab.py
python train.py --gpu 0 --epochs 3001 --batch_size 300 --checkpoint_path data/save --feats_dir data/feats/resnet152 --model S2VTAttModel --with_c3d 1 --c3d_feats_dir data/feats/c3d_feats --dim_vid 4096
test
opt_info.json will be in the same directory as the saved model.
python eval.py --recover_opt data/save/opt_info.json --saved_model data/save/model_1000.pth --batch_size 100 --gpu 1
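The idea behind --recover_opt can be sketched as follows: the training options are written to opt_info.json next to the checkpoints, and evaluation reads them back so the model is rebuilt with the same settings. The helper names `save_opt` and `recover_opt` and the override behavior are assumptions for illustration, not the repository's actual code.

```python
import json
import os
import tempfile


def save_opt(opt, checkpoint_path):
    """Write the option dict to opt_info.json inside checkpoint_path."""
    os.makedirs(checkpoint_path, exist_ok=True)
    path = os.path.join(checkpoint_path, "opt_info.json")
    with open(path, "w") as f:
        json.dump(opt, f, indent=2)
    return path


def recover_opt(path, overrides=None):
    """Load saved options; values in overrides (hypothetical) replace saved ones."""
    with open(path) as f:
        opt = json.load(f)
    opt.update(overrides or {})
    return opt


# A temporary directory stands in for data/save in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    path = save_opt({"model": "S2VTAttModel", "dim_vid": 4096, "batch_size": 300}, tmp)
    opt = recover_opt(path, {"batch_size": 100})

print(opt)
```

Model-shape options such as dim_vid must match training, which is why they are recovered from the file, while run-time options such as the batch size can differ at evaluation.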