If you like our work and want to start your own scene graph generation project, you might be interested in our new SGG codebase: Scene-Graph-Benchmark.pytorch. It is much easier to follow and provides state-of-the-art baseline models.
Code for the Scene Graph Generation part of the CVPR 2019 oral paper "Learning to Compose Dynamic Tree Structures for Visual Contexts". For the VQA part of the paper, please refer to KaihuaTang/VCTree-Visual-Question-Answering.
UGLY CODE WARNING! UGLY CODE WARNING! UGLY CODE WARNING!
The code is directly modified from the project rowanz/neural-motifs. Most of the code for the proposed VCTree is located under lib/tree_lstm/*. If you run into a problem that prevents you from running the project, check the issues under rowanz/neural-motifs first.
If my open source projects have inspired you, some sponsorship would be a great help to my subsequent open source work. ❤️🙏
Install Anaconda, then create the environment and install the dependencies:
conda update -n base conda
conda create -n motif pip python=3.6
conda install pytorch=0.3 torchvision cuda90 -c pytorch
bash install_package.sh
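A minimal sanity check (not part of the original scripts) to confirm the pinned PyTorch 0.3 build imports and sees your GPU:

```bash
# Activate the environment and check that PyTorch 0.3 imports with CUDA.
source activate motif   # or: conda activate motif
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```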
Please follow the instructions under ./data/stanford_filtered/ to download the dataset and put the files in the proper locations.
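For reference, this is roughly the layout you should end up with; the filenames below are assumed from the upstream neural-motifs data README, so double-check them against the instructions in ./data/stanford_filtered/:

```bash
# Expected contents (filenames assumed from the upstream neural-motifs data
# README; verify against ./data/stanford_filtered/):
#   image_data.json     VG image metadata
#   VG-SGG.h5           scene graph annotations
#   VG-SGG-dicts.json   object/predicate dictionaries
#   proposals.h5        precomputed region proposals
#   VG_100K/            Visual Genome images
ls data/stanford_filtered
```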
Update the config file (config.py) with the dataset paths. You also need to add the project root to your PYTHONPATH:
export PYTHONPATH=/home/YourName/ThePathOfYourProject
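If you want the setting to persist across shells, append it to your shell profile, for example:

```bash
# Persist PYTHONPATH for future shells (adjust the path to your checkout).
echo 'export PYTHONPATH=/home/YourName/ThePathOfYourProject' >> ~/.bashrc
source ~/.bashrc
```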
Compile everything: run make in the main directory. This compiles the bilinear interpolation operation for the RoIs.
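As a quick smoke test that the compiled ops are importable, something like the following should work; the module path below mirrors the upstream neural-motifs layout and is an assumption, so adjust it to your checkout:

```bash
# Hypothetical smoke test: the module path mirrors upstream neural-motifs
# and may differ in this repo.
python -c "from lib.fpn.roi_align.functions.roi_align import RoIAlignFunction; print('roi_align ok')"
```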
Pretrain VG detection. The old version involved pretraining on COCO as well, but we got rid of that for simplicity. Run ./scripts/pretrain_detector.sh. Note: you might have to modify the learning rate and batch size, particularly if you don't have 3 Titan X GPUs (which is what I used). You can also download the pretrained detector checkpoint here. Note that this detector model is the default initialization of all VCTree models, so when you download this checkpoint, you need to update the "-ckpt THE_PATH_OF_INITIAL_CHECKPOINT_MODEL" argument in ./scripts/train_vctreenet.sh.
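In other words, you can either pretrain the detector yourself or plug in the downloaded checkpoint; the checkpoint path below is hypothetical:

```bash
# Option 1: pretrain the detector yourself.
./scripts/pretrain_detector.sh
# Option 2: download the released checkpoint, then point the training
# commands at it by editing ./scripts/train_vctreenet.sh, e.g.
#   -ckpt checkpoints/vgdet/vg-faster-rcnn.tar   (hypothetical path)
```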
Note that most of the parameters live in config.py. The training stages and settings are controlled through ./scripts/train_vctreenet.sh. Each command line in train_vctreenet.sh needs to manually specify "-ckpt" (the initial parameters) and "-save_dir" (the path where the model is saved). Since we use a hybrid learning strategy, each task (predcls/sgcls/sgdet) has two stages: a supervised stage and a reinforcement finetuning stage. When iteratively switching stages, the -ckpt PATH should point into the previous stage's -save_dir PATH. The first supervised stage is initialized with the detector checkpoint mentioned above.
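A hedged sketch of that checkpoint chaining for a single task (TRAIN_CMD stands for the python command already present in train_vctreenet.sh; all paths are hypothetical):

```bash
# Hybrid-learning checkpoint chaining for, e.g., predcls. TRAIN_CMD is a
# placeholder for the python command already in ./scripts/train_vctreenet.sh.
# Stage 1: supervised training, initialized from the pretrained detector.
TRAIN_CMD -ckpt checkpoints/vgdet/vg-faster-rcnn.tar -save_dir checkpoints/predcls_sl
# Stage 2: reinforcement finetuning, initialized from the stage-1 output.
TRAIN_CMD -ckpt checkpoints/predcls_sl/vgrel-best.tar -save_dir checkpoints/predcls_rl
```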
Train VG predicate classification (predcls)
Train VG scene graph classification (sgcls)
Train VG scene graph detection (sgdet); a launch sketch for all three tasks follows:
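All three tasks go through the same script; which task and stage run is controlled by the command lines you edit inside it, as described above:

```bash
# Launch training; edit the command lines inside the script to pick the task
# (predcls/sgcls/sgdet) and stage, with the -ckpt/-save_dir pairing above.
./scripts/train_vctreenet.sh
```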
Evaluate predicate classification (predcls):
Evaluate scene graph classification (sgcls):
Evaluate scene graph detection (sgdet); an evaluation sketch for all three tasks follows:
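A hedged evaluation sketch, assuming this repo keeps the upstream neural-motifs entry point models/eval_rels.py and its -m/-ckpt/-test flags; the checkpoint paths are hypothetical:

```bash
# Assumed entry point and flags (inherited from neural-motifs); paths are
# hypothetical placeholders for your trained checkpoints.
python models/eval_rels.py -m predcls -ckpt checkpoints/predcls_rl/vgrel-best.tar -test
python models/eval_rels.py -m sgcls   -ckpt checkpoints/sgcls_rl/vgrel-best.tar   -test
python models/eval_rels.py -m sgdet   -ckpt checkpoints/sgdet_rl/vgrel-best.tar   -test
```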
@inproceedings{tang2018learning,
title={Learning to Compose Dynamic Tree Structures for Visual Contexts},
author={Tang, Kaihua and Zhang, Hanwang and Wu, Baoyuan and Luo, Wenhan and Liu, Wei},
booktitle={Conference on Computer Vision and Pattern Recognition},
year={2019}
}