[Paper] | [Project Page] | [Demo Video]
If you use this code for your research, please cite:
Learning to Detect Every Thing in an Open World.
Kuniaki Saito, Ping Hu, Trevor Darrell, Kate Saenko. arXiv, 2021. [Bibtex]
[2022/02/20] We found a bug in the evaluation code and are rewriting the paper. Please contact the authors if you plan to cite our paper.
[2022/03/11] We fixed the bug in ldet/evaluation/coco_evaluation.py; the PASCAL class ids were not indexed correctly. We will further update the repo to follow the updated paper.
[2022/04/11] Updated the evaluation code to follow the new paper.
Requirements
Build LDET
Use conda to create a new environment.
conda create --name ldet
conda activate ldet
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
Currently, the codebase is compatible with Detectron2 v0.6. Example for PyTorch v1.10.0 and CUDA v11.3:
python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
python3 -m pip install -r requirements.txt
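To confirm the environment is set up as expected, a quick check along these lines may help (a minimal sketch; it only assumes the packages installed above):

```python
# Sanity check for the environment built above.
import torch
import detectron2

print("torch:", torch.__version__)            # expected 1.10.x for the wheel above
print("CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)  # expected 0.6
```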
We provide dataset preparation instructions for COCO, UVO, Cityscapes, Mapillary (v2.0), and Obj365 in this repository:
COCO. We train on the train split and evaluate on the validation split. Download the COCO dataset following detectron2's instructions.
UVO. We downloaded uvo_videos_sparse.zip and evaluated on those videos. Follow the UVO instructions to split the videos into frames (a rough sketch is given after the directory layout below). The json split used for evaluation is available at the Dropbox Link. Update the corresponding line in builtin.py.
For example, the data structure of the UVO dataset is as follows:
uvo_frames_sparse/video1/0.png
uvo_frames_sparse/video1/1.png
.
.
.
uvo_frames_sparse/video2/0.png
.
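The UVO toolkit ships its own frame-extraction script; the sketch below only illustrates the idea with OpenCV, assuming an input directory uvo_videos_sparse/ of video files and the output layout shown above (both paths are placeholders for your local setup):

```python
# Minimal sketch of splitting UVO videos into frames with OpenCV.
# Paths and the 0-based frame naming are assumptions matching the layout above.
import os
import cv2

video_dir = "uvo_videos_sparse"    # placeholder: directory of downloaded videos
frame_dir = "uvo_frames_sparse"    # placeholder: output directory of frames

for fname in sorted(os.listdir(video_dir)):
    name, _ = os.path.splitext(fname)
    out_dir = os.path.join(frame_dir, name)
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(os.path.join(video_dir, fname))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{idx}.png"), frame)
        idx += 1
    cap.release()
```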
Cityscapes. Follow detectron2's instructions. Update the corresponding line in builtin.py.
Mapillary. Update the line in builtin.py.
Obj365. We used the validation set and picked 5,000 images for evaluation. The json is available here. Please update the paths in the lists to match your environment.
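"Update the line in builtin.py" amounts to pointing the dataset registration at your local json file and image root. The sketch below uses detectron2's generic COCO-format helper with placeholder names and paths; the actual calls in builtin.py may differ:

```python
# A minimal sketch of registering a local dataset the way builtin.py does.
# Dataset name and paths are placeholders for your environment.
from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "uvo_sparse_val",                    # hypothetical dataset name
    {},
    "path/to/uvo_val_annotations.json",  # json split (e.g. from the Dropbox Link)
    "path/to/uvo_frames_sparse",         # frame directory built above
)
```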
The trained weights are available from the model links in the table below.
| Method | Training Dataset | Evaluation Dataset | box AP | box AR | seg AP | seg AR | Link |
|---|---|---|---|---|---|---|---|
| Mask RCNN | VOC-COCO | Non-VOC | 1.5 | 10.9 | 0.7 | 9.1 | model \| config |
| Mask RCNNS | VOC-COCO | Non-VOC | 3.4 | 18.0 | 2.2 | 15.8 | model \| config |
| LDET | VOC-COCO | Non-VOC | 5.0 | 30.8 | 4.7 | 27.4 | model \| config |
| Mask RCNNS | COCO | UVO | 25.3 | 42.3 | 20.6 | 35.9 | model \| config |
| LDET | COCO | UVO | 25.8 | 47.5 | 21.9 | 40.7 | model \| config |
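As a rough example of using a downloaded checkpoint outside the provided scripts, the sketch below runs single-image inference with detectron2's DefaultPredictor. The weight path, image path, and score threshold are placeholders, and LDET configs may require this repository's modules to be importable:

```python
# Sketch: single-image inference with a downloaded checkpoint (paths are placeholders).
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.set_new_allowed(True)  # in case the LDET config adds keys beyond the detectron2 defaults
cfg.merge_from_file("configs/COCO/mask_rcnn_R_50_FPN.yaml")
cfg.MODEL.WEIGHTS = "path/to/downloaded_model.pth"   # checkpoint from the table above
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5          # illustrative threshold

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("path/to/image.jpg"))
print(outputs["instances"].pred_boxes)
```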
To train a model, run
## Training on VOC-COCO
sh tools/run_train.sh configs/VOC-COCO/voc_coco_mask_rcnn_R_50_FPN.yaml save_dir
## Training on COCO
sh tools/run_train.sh configs/COCO/mask_rcnn_R_50_FPN.yaml save_dir
## Training on Cityscapes
sh tools/run_train.sh configs/Cityscapes/mask_rcnn_R_50_FPN.yaml save_dir
Note that training produces two output directories, one for the normal model and the other for the exponentially moving averaged (EMA) model; we used the latter for evaluation.
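For reference, the sketch below shows the general idea behind exponentially moving averaged weights; the decay value is illustrative and this is not necessarily the implementation used in this repository:

```python
# Conceptual sketch of EMA weights: after each optimizer step, the EMA copy is
# blended toward the current weights. The 0.999 decay is illustrative only.
import copy
import torch

def update_ema(ema_model: torch.nn.Module, model: torch.nn.Module, decay: float = 0.999):
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

model = torch.nn.Linear(4, 2)
ema_model = copy.deepcopy(model)
# ... after each training step:
update_ema(ema_model, model)
```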
To evaluate the trained models, run
## Test on Non-VOC-COCO
sh tools/run_test.sh configs/VOC-COCO/voc_coco_mask_rcnn_R_50_FPN.yaml weight_to_eval
## Test on UVO, Obj365
sh tools/run_test.sh configs/COCO/mask_rcnn_R_50_FPN.yaml weight_to_eval
## Test on Mapillary
sh tools/run_test.sh configs/Cityscapes/mask_rcnn_R_50_FPN.yaml weight_to_eval
The above script reports two sets of results: agnostic mode and classwise mode. Agnostic mode treats all instances as a single class, while classwise mode distinguishes between classes. To account for class imbalance, our paper reports AR in classwise mode and AP in agnostic mode. Note that the above script computes performance on novel classes only; to get performance on all classes, disable the exclude_known flag.
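Conceptually, agnostic mode can be thought of as collapsing every predicted category into one class before COCO-style matching, so only objectness is scored. The snippet below only illustrates that idea and is not the repository's evaluation code (which lives in ldet/evaluation/coco_evaluation.py):

```python
# Illustration of class-agnostic evaluation: map every prediction to a single
# category id before running COCO-style matching.
def to_class_agnostic(coco_results):
    """coco_results: list of dicts in COCO detection-result format."""
    for r in coco_results:
        r["category_id"] = 1  # collapse all classes into one
    return coco_results
```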