A PyTorch implementation for the paper:
Fully Convolutional Scene Graph Generation [paper], CVPR 2021
Hengyue Liu*, Ning Yan, Masood Mortazavi, Bir Bhanu.
* Work done in part as an intern at Futurewei Technologies Inc.
The project is built upon Detectron2, which we include as a git submodule for easy use.
```bash
# clone this repo
git clone <repo-url> fcsgg
cd fcsgg
# init and pull the submodules
git submodule init
git submodule update
# install dependencies
pip install -r requirements.txt
```
If using Docker, you can pull the PyTorch 1.6.0 image:
```bash
docker pull pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
```
then mount the repo and run the container:
```bash
docker run --gpus all --rm -ti --ipc=host --network=host --name="fcsgg" \
    -v ~/fcsgg:/workspace/fcsgg pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
```
Extra system libraries may be needed:
```bash
apt-get install libgl1-mesa-glx
apt-get install libglib2.0-0
```
```bash
# Detectron2 installation, or follow their instructions
python -m pip install -e detectron2
```
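To verify the installation, here is a quick sanity check (a minimal sketch; it only assumes `torch` and `detectron2` import cleanly):
```bash
# print the PyTorch version, CUDA availability, and Detectron2 version
python -c "import torch, detectron2; print(torch.__version__, torch.cuda.is_available(), detectron2.__version__)"
```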
Download the Visual Genome images into `datasets/vg/VG_100K`:
```bash
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip -P datasets/vg/
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip -P datasets/vg/
unzip -j datasets/vg/images.zip -d datasets/vg/VG_100K
unzip -j datasets/vg/images2.zip -d datasets/vg/VG_100K
```
Optionally, remove the `.zip` files:
```bash
rm datasets/vg/images.zip
rm datasets/vg/images2.zip
```
Download the annotations and place them at `datasets/vg/VG-SGG-with-attri.h5`. If you use other paths, you may need to modify the related paths in the file `visual_genome.py`.
The correct structure of files should be:
```
fcsgg/
|-- datasets/
    |-- vg/
        |-- VG-SGG-with-attri.h5         # `roidb_file`, HDF5 containing the GT boxes, classes, and relationships
        |-- VG-SGG-dicts-with-attri.json # `dict_file`, JSON containing the mapping of classes/relationships to words
        |-- image_data.json              # `image_file`, JSON containing image metadata and filenames
        |-- VG_100K/                     # `img_dir`, contains all the images
            |-- 1.jpg
            |-- 2.jpg
            |-- ...
```
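For a quick sanity check of the annotations, you can list the top-level keys of the two files (a minimal sketch; it assumes `h5py` is available, e.g. via `pip install h5py` if it is not already installed):
```bash
# list the top-level keys of the ROI database and the label dictionaries
python -c "import h5py; print(list(h5py.File('datasets/vg/VG-SGG-with-attri.h5', 'r').keys()))"
python -c "import json; print(list(json.load(open('datasets/vg/VG-SGG-dicts-with-attri.json')).keys()))"
```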
To get familiar with Detectron2, you can find useful material in the Detectron2 Doc.
The program entry point is `tools/train_net.py`. For a more instructional walk-through of the logic, I wrote a simple script in `tools/net_logic.py`; for understanding the Visual Genome dataset and dataloader, you can find some visualizations in `data_visualization.ipynb`.
A minimal training example:
```bash
python tools/train_net.py \
    --num-gpus 1 \
    --config-file configs/quick_schedules/Quick-FCSGG-HRNet-W32.yaml
```
Detectron2 provides a key-value based config system that can be used to obtain standard, common behaviors. Common arguments can be listed by running `python tools/train_net.py -h`. `--config-file` is required; other arguments like `--resume` (resume from the latest checkpoint) and `--eval-only` (only execute evaluation code) are useful too. Config values appended after `--config-file` override those in the default `.yaml` file; for example, `MODEL.WEIGHTS ckpt_path` changes the checkpoint path.
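For instance, here is a hypothetical run that resumes training on 4 GPUs and overrides the batch size from the command line (the config path is the one from the example above; `SOLVER.IMS_PER_BATCH 8` is an illustrative value, not a recommended setting):
```bash
python tools/train_net.py \
    --num-gpus 4 \
    --resume \
    --config-file configs/quick_schedules/Quick-FCSGG-HRNet-W32.yaml \
    SOLVER.IMS_PER_BATCH 8
```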
A minimal evaluation example:
```bash
python tools/train_net.py \
    --num-gpus 1 \
    --eval-only \
    --config-file configs/quick_schedules/Quick-FCSGG-HRNet-W32.yaml \
    MODEL.WEIGHTS "put your .pth checkpoint path here"
```
Model | Checkpoint | Config |
---|---|---|
HRNetW32-1S | download | yaml |
ResNet50-4S-FPN×2 | download | yaml |
HRNetW48-5S-FPN×2 | download | yaml |
If you find our code or method helpful, please use the following BibTeX entry.
```bibtex
@inproceedings{liu2021fully,
  title={Fully Convolutional Scene Graph Generation},
  author={Liu, Hengyue and Yan, Ning and Mortazavi, Masood and Bhanu, Bir},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11546--11556},
  year={2021}
}
```
We thank these authors and implementations:
https://github.com/facebookresearch/detectron2
https://github.com/open-mmlab/mmdetection/blob/master
https://github.com/xingyizhou/CenterNet
https://github.com/FateScript/CenterNet-better