[Paper] | [Project Page] | [Demo Video]
If you use this code for your research, please cite:
Learning to Detect Every Thing in an Open World.
Kuniaki Saito, Ping Hu, Trevor Darrell, Kate Saenko. arXiv, 2021. [Bibtex]
[2022/02/20] We found a bug in the evaluation code and are rewriting the paper. Please contact the authors if you plan to cite our paper.
[2022/03/11] We fixed the bug in ldet/evaluation/coco_evaluation.py; the PASCAL class ids were not indexed correctly. We will further update the repo to follow the updated paper.
[2022/04/11] Updated the evaluation code to follow the new paper.
Requirements
Build LDET
Use conda to create a new environment.
conda create --name ldet
conda activate ldet
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
Currently, the codebase is compatible with Detectron2 v0.6. Example for PyTorch v1.10.0 and CUDA v11.3:
python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
python3 -m pip install -r requirements.txt
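To confirm the environment is set up as expected, a quick check along these lines may help (a minimal sketch; it only assumes the packages installed above):

```python
# Sanity check for the environment built above.
import torch
import detectron2

print("torch:", torch.__version__)            # expected 1.10.x for the wheel above
print("CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)  # expected 0.6
```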
We provide dataset preparation instructions for COCO, UVO, Cityscapes, Mapillary (v2.0), and Obj365 in this repository:
COCO. We train on the train split and evaluate on the validation split. Download the COCO dataset following detectron2's instructions.
UVO. We downloaded uvo_videos_sparse.zip and evaluated on those videos. Follow the UVO instructions to split the videos into frames (a rough sketch is given after the directory layout below). The json split used for evaluation is available at the Dropbox Link. Update the corresponding line in builtin.py.
For example, the data structure of the UVO dataset is as follows:
uvo_frames_sparse/video1/0.png
uvo_frames_sparse/video1/1.png
.
.
.
uvo_frames_sparse/video2/0.png
.
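The UVO toolkit ships its own frame-extraction script; the sketch below only illustrates the idea with OpenCV, assuming an input directory uvo_videos_sparse/ of video files and the output layout shown above (both paths are placeholders for your local setup):

```python
# Minimal sketch of splitting UVO videos into frames with OpenCV.
# Paths and the 0-based frame naming are assumptions matching the layout above.
import os
import cv2

video_dir = "uvo_videos_sparse"    # placeholder: directory of downloaded videos
frame_dir = "uvo_frames_sparse"    # placeholder: output directory of frames

for fname in sorted(os.listdir(video_dir)):
    name, _ = os.path.splitext(fname)
    out_dir = os.path.join(frame_dir, name)
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(os.path.join(video_dir, fname))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{idx}.png"), frame)
        idx += 1
    cap.release()
```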
Cityscapes. Follow detectron2's instructions. Update the corresponding line in builtin.py.
Mapillary. Update the line in builtin.py.
Obj365. We used the validation set and picked 5,000 images for evaluation. The json is available here. Please update the paths in the lists to match your environment.
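"Update the line in builtin.py" amounts to pointing the dataset registration at your local json file and image root. The sketch below uses detectron2's generic COCO-format helper with placeholder names and paths; the actual calls in builtin.py may differ:

```python
# A minimal sketch of registering a local dataset the way builtin.py does.
# Dataset name and paths are placeholders for your environment.
from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "uvo_sparse_val",                    # hypothetical dataset name
    {},
    "path/to/uvo_val_annotations.json",  # json split (e.g. from the Dropbox Link)
    "path/to/uvo_frames_sparse",         # frame directory built above
)
```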
The trained weights are available from the model links in the table below.
| Method | Training Dataset | Evaluation Dataset | box AP | box AR | seg AP | seg AR | Link |
|---|---|---|---|---|---|---|---|
| Mask RCNN | VOC-COCO | Non-VOC | 1.5 | 10.9 | 0.7 | 9.1 | model \| config |
| Mask RCNNS | VOC-COCO | Non-VOC | 3.4 | 18.0 | 2.2 | 15.8 | model \| config |
| LDET | VOC-COCO | Non-VOC | 5.0 | 30.8 | 4.7 | 27.4 | model \| config |
| Mask RCNNS | COCO | UVO | 25.3 | 42.3 | 20.6 | 35.9 | model \| config |
| LDET | COCO | UVO | 25.8 | 47.5 | 21.9 | 40.7 | model \| config |
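As a rough example of using a downloaded checkpoint outside the provided scripts, the sketch below runs single-image inference with detectron2's DefaultPredictor. The weight path, image path, and score threshold are placeholders, and LDET configs may require this repository's modules to be importable:

```python
# Sketch: single-image inference with a downloaded checkpoint (paths are placeholders).
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.set_new_allowed(True)  # in case the LDET config adds keys beyond the detectron2 defaults
cfg.merge_from_file("configs/COCO/mask_rcnn_R_50_FPN.yaml")
cfg.MODEL.WEIGHTS = "path/to/downloaded_model.pth"   # checkpoint from the table above
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5          # illustrative threshold

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("path/to/image.jpg"))
print(outputs["instances"].pred_boxes)
```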
To train a model, run
## Training on VOC-COCO
sh tools/run_train.sh configs/VOC-COCO/voc_coco_mask_rcnn_R_50_FPN.yaml save_dir
## Training on COCO
sh tools/run_train.sh configs/COCO/mask_rcnn_R_50_FPN.yaml save_dir
## Training on Cityscapes
sh tools/run_train.sh configs/Cityscapes/mask_rcnn_R_50_FPN.yaml save_dir
Note that training produces two output directories, one for the normal model and the other for the exponentially moving averaged (EMA) model; we used the latter for evaluation.
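For reference, the sketch below shows the general idea behind exponentially moving averaged weights; the decay value is illustrative and this is not necessarily the implementation used in this repository:

```python
# Conceptual sketch of EMA weights: after each optimizer step, the EMA copy is
# blended toward the current weights. The 0.999 decay is illustrative only.
import copy
import torch

def update_ema(ema_model: torch.nn.Module, model: torch.nn.Module, decay: float = 0.999):
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

model = torch.nn.Linear(4, 2)
ema_model = copy.deepcopy(model)
# ... after each training step:
update_ema(ema_model, model)
```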
To evaluate the trained models, run
## Test on Non-VOC-COCO
sh tools/run_test.sh configs/VOC-COCO/voc_coco_mask_rcnn_R_50_FPN.yaml weight_to_eval
## Test on UVO, Obj365
sh tools/run_test.sh configs/COCO/mask_rcnn_R_50_FPN.yaml weight_to_eval
## Test on Mapillary
sh tools/run_test.sh configs/Cityscapes/mask_rcnn_R_50_FPN.yaml weight_to_eval
The above script reports two sets of results: agnostic mode and classwise mode. Agnostic mode treats all instances as a single class, while classwise mode distinguishes between classes. To account for class imbalance, our paper reports AR in classwise mode and AP in agnostic mode. Note that the above script computes performance on novel classes only; to get performance on all classes, disable the exclude_known flag.
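Conceptually, agnostic mode can be thought of as collapsing every predicted category into one class before COCO-style matching, so only objectness is scored. The snippet below only illustrates that idea and is not the repository's evaluation code (which lives in ldet/evaluation/coco_evaluation.py):

```python
# Illustration of class-agnostic evaluation: map every prediction to a single
# category id before running COCO-style matching.
def to_class_agnostic(coco_results):
    """coco_results: list of dicts in COCO detection-result format."""
    for r in coco_results:
        r["category_id"] = 1  # collapse all classes into one
    return coco_results
```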