A PyTorch implementation for the paper:
Fully Convolutional Scene Graph Generation [paper], CVPR 2021
Hengyue Liu*, Ning Yan, Masood Mortazavi, Bir Bhanu.
* Work done in part as an intern at Futurewei Technologies Inc.
The project is built upon Detectron2, which we include as a git submodule for easy use.
```bash
# clone this repo
git clone <repo-url> fcsgg
cd fcsgg
# init and pull the submodules
git submodule init
git submodule update
# install dependencies
pip install -r requirements.txt
```
If using Docker, you can pull the PyTorch 1.6.0 image:
```bash
docker pull pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
```
then mount the repo and run the container:
```bash
docker run --gpus all --rm -ti --ipc=host --network=host --name="fcsgg" \
    -v ~/fcsgg:/workspace/fcsgg pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
```
Extra system libraries may be needed:
```bash
apt-get install libgl1-mesa-glx
apt-get install libglib2.0-0
```
```bash
# Detectron2 installation, or follow their instructions
python -m pip install -e detectron2
```
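To verify the installation, here is a quick sanity check (a minimal sketch; it only assumes `torch` and `detectron2` import cleanly):
```bash
# print the PyTorch version, CUDA availability, and Detectron2 version
python -c "import torch, detectron2; print(torch.__version__, torch.cuda.is_available(), detectron2.__version__)"
```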
Download the Visual Genome images into `datasets/vg/VG_100K`:
```bash
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip -P datasets/vg/
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip -P datasets/vg/
unzip -j datasets/vg/images.zip -d datasets/vg/VG_100K
unzip -j datasets/vg/images2.zip -d datasets/vg/VG_100K
```
Optionally, remove the `.zip` files:
```bash
rm datasets/vg/images.zip
rm datasets/vg/images2.zip
```
Download the annotations and place them at `datasets/vg/VG-SGG-with-attri.h5`. If you use other paths, you may need to modify the related paths in the file `visual_genome.py`.
The correct structure of files should be:
```
fcsgg/
|-- datasets/
    |-- vg/
        |-- VG-SGG-with-attri.h5         # `roidb_file`, HDF5 containing the GT boxes, classes, and relationships
        |-- VG-SGG-dicts-with-attri.json # `dict_file`, JSON containing the mapping of classes/relationships to words
        |-- image_data.json              # `image_file`, JSON containing image metadata and filenames
        |-- VG_100K/                     # `img_dir`, contains all the images
            |-- 1.jpg
            |-- 2.jpg
            |-- ...
```
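For a quick sanity check of the annotations, you can list the top-level keys of the two files (a minimal sketch; it assumes `h5py` is available, e.g. via `pip install h5py` if it is not already installed):
```bash
# list the top-level keys of the ROI database and the label dictionaries
python -c "import h5py; print(list(h5py.File('datasets/vg/VG-SGG-with-attri.h5', 'r').keys()))"
python -c "import json; print(list(json.load(open('datasets/vg/VG-SGG-dicts-with-attri.json')).keys()))"
```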
To get familiar with Detectron2, you can find useful material in the Detectron2 Doc.
The program entry point is `tools/train_net.py`. For a more instructional walk-through of the logic, I wrote a simple script in `tools/net_logic.py`; for understanding the Visual Genome dataset and dataloader, you can find some visualizations in `data_visualization.ipynb`.
A minimal training example:
```bash
python tools/train_net.py \
    --num-gpus 1 \
    --config-file configs/quick_schedules/Quick-FCSGG-HRNet-W32.yaml
```
Detectron2 provides a key-value based config system that can be used to obtain standard, common behaviors. Common arguments can be listed by running `python tools/train_net.py -h`. `--config-file` is required; other arguments like `--resume` (resume from the latest checkpoint) and `--eval-only` (only execute evaluation code) are useful too. Config values appended after `--config-file` override those in the default `.yaml` file; for example, `MODEL.WEIGHTS ckpt_path` changes the checkpoint path.
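For instance, here is a hypothetical run that resumes training on 4 GPUs and overrides the batch size from the command line (the config path is the one from the example above; `SOLVER.IMS_PER_BATCH 8` is an illustrative value, not a recommended setting):
```bash
python tools/train_net.py \
    --num-gpus 4 \
    --resume \
    --config-file configs/quick_schedules/Quick-FCSGG-HRNet-W32.yaml \
    SOLVER.IMS_PER_BATCH 8
```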
A minimal evaluation example:
```bash
python tools/train_net.py \
    --num-gpus 1 \
    --eval-only \
    --config-file configs/quick_schedules/Quick-FCSGG-HRNet-W32.yaml \
    MODEL.WEIGHTS "put your .pth checkpoint path here"
```
Model | Checkpoint | Config |
---|---|---|
HRNetW32-1S | download | yaml |
ResNet50-4S-FPN×2 | download | yaml |
HRNetW48-5S-FPN×2 | download | yaml |
If you find our code or method helpful, please use the following BibTeX entry.
```bibtex
@inproceedings{liu2021fully,
  title={Fully Convolutional Scene Graph Generation},
  author={Liu, Hengyue and Yan, Ning and Mortazavi, Masood and Bhanu, Bir},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11546--11556},
  year={2021}
}
```
We thank these authors and implementations:
https://github.com/facebookresearch/detectron2
https://github.com/open-mmlab/mmdetection/blob/master
https://github.com/xingyizhou/CenterNet
https://github.com/FateScript/CenterNet-better