Code for "Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks". ECCV 2022.
Created by Jiehong Lin, Zewei Wei, Changxing Ding, and Kui Jia.
The code has been tested with
Some dependent packages:
Download the data provided by NOCS (camera_train, camera_test, camera_composed_depths, real_train, real_test, ground truths, and mesh models) and the segmentation results (Link), and unzip them into the data folder as follows:
data
├── CAMERA
│ ├── train
│ └── val
├── camera_full_depths
│ ├── train
│ └── val
├── Real
│ ├── train
│ └── test
├── gts
│ ├── val
│ └── real_test
├── obj_models
│ ├── train
│ ├── val
│ ├── real_train
│ └── real_test
├── segmentation_results
│ ├── train_trainedwoMask
│ ├── test_trainedwoMask
│ └── test_trainedwithMask
└── mean_shapes.npy
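Before preprocessing, it can help to confirm that the unzipped data matches the tree above. The following is a minimal sketch, not part of the released code: the helper name `missing_entries` and the default root `data` are assumptions made for illustration, and the list of paths is copied from the tree.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
EXPECTED = [
    "CAMERA/train",
    "CAMERA/val",
    "camera_full_depths/train",
    "camera_full_depths/val",
    "Real/train",
    "Real/test",
    "gts/val",
    "gts/real_test",
    "obj_models/train",
    "obj_models/val",
    "obj_models/real_train",
    "obj_models/real_test",
    "segmentation_results/train_trainedwoMask",
    "segmentation_results/test_trainedwoMask",
    "segmentation_results/test_trainedwithMask",
    "mean_shapes.npy",
]


def missing_entries(data_root="data"):
    """Return the expected relative paths that do not exist under data_root.

    Hypothetical helper for illustration; it only checks existence,
    not the contents of each folder.
    """
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing entries under data/:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected entries are present.")
```

Run it from the repository root; an empty result means every folder and file named in the tree was found.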
Run the following script to prepare the dataset:
python data_processing.py
Train DPDN under the unsupervised setting:
python train.py --gpus 0,1 --config config/unsupervised.yaml
Train DPDN under the supervised setting:
python train.py --gpus 0,1 --config config/supervised.yaml
Download trained models and test results [Link]. Evaluate our models under different settings:
python test.py --config config/unsupervised.yaml
python test.py --config config/supervised.yaml
or directly evaluate our results on the REAL275 test set:
python test.py --config config/unsupervised.yaml --only_eval
python test.py --config config/supervised.yaml --only_eval
One can also evaluate our models under the easier unsupervised setting with mask labels for segmentation (still without pose annotations):
python test.py --config config/unsupervised.yaml --mask_label
or
python test.py --config config/unsupervised.yaml --mask_label --only_eval
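The evaluation commands above differ only in the config file and in two optional flags. As a rough sketch, they can be assembled programmatically, for example to print or log every variant before running it. The helper `eval_command` is hypothetical and not part of the repository; it uses only the flags shown above and assumes, as in the commands listed, that `--mask_label` is combined only with the unsupervised config.

```python
import shlex


def eval_command(setting, mask_label=False, only_eval=False):
    """Build one of the test.py command lines listed above.

    Hypothetical helper: it only assembles the argument list and does
    not run anything.
    """
    if setting not in ("unsupervised", "supervised"):
        raise ValueError("setting must be 'unsupervised' or 'supervised'")
    if mask_label and setting != "unsupervised":
        # The README only shows --mask_label with the unsupervised config.
        raise ValueError("--mask_label is only shown for the unsupervised setting")
    cmd = ["python", "test.py", "--config", f"config/{setting}.yaml"]
    if mask_label:
        cmd.append("--mask_label")
    if only_eval:
        cmd.append("--only_eval")
    return cmd


if __name__ == "__main__":
    variants = [
        ("unsupervised", False),
        ("supervised", False),
        ("unsupervised", True),
    ]
    for setting, mask_label in variants:
        for only_eval in (False, True):
            # shlex.join (Python 3.8+) renders the list as a shell command line.
            print(shlex.join(eval_command(setting, mask_label, only_eval)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the corresponding command.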
Quantitative results on the REAL275 test set:
Setting | IoU25 | IoU75 | 5 degree 2 cm | 5 degree 5 cm | 10 degree 2 cm | 10 degree 5 cm |
---|---|---|---|---|---|---|
unsupervised | 72.6 | 63.8 | 37.8 | 45.5 | 59.8 | 71.3 |
unsupervised (with masks) | 83.0 | 70.3 | 39.4 | 45.0 | 59.8 | 72.1 |
supervised | 83.4 | 76.0 | 46.0 | 50.7 | 70.4 | 78.4 |
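The pose columns in the table above (for example "5 degree 2 cm") are usually read as: a prediction counts as correct when its rotation error is at most n degrees and its translation error is at most m cm. The sketch below illustrates that check under stated assumptions: rotations are 3x3 matrices given as nested lists, translations are in metres, and all function names are hypothetical. It is not the evaluation code used by `test.py`, which may, for instance, treat symmetric object categories differently.

```python
import math


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_pred^T R_gt) equals the sum of elementwise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_cm(t_pred, t_gt):
    """Euclidean distance in cm, assuming both inputs are in metres."""
    return 100.0 * math.dist(t_pred, t_gt)


def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg, max_cm):
    """True if both rotation and translation errors are within the limits."""
    return (
        rotation_error_deg(R_pred, R_gt) <= max_deg
        and translation_error_cm(t_pred, t_gt) <= max_cm
    )


def rot_z(deg):
    """Rotation about the z-axis by deg degrees (used only for the demo)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    R_gt, t_gt = rot_z(0.0), [0.0, 0.0, 0.5]
    R_pred, t_pred = rot_z(7.0), [0.01, 0.0, 0.5]
    # 7 degrees off in rotation, 1 cm off in translation.
    print(within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=5, max_cm=2))
    print(within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=10, max_cm=2))
```

In the demo, the prediction fails the 5 degree 2 cm check because of its rotation error but passes the 10 degree 2 cm check.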
If you find our work useful in your research, please consider citing:
@inproceedings{lin2022category,
title={Category-level 6D object pose and size estimation using self-supervised deep prior deformation networks},
author={Lin, Jiehong and Wei, Zewei and Ding, Changxing and Jia, Kui},
booktitle={European Conference on Computer Vision},
pages={19--34},
year={2022},
organization={Springer}
}
Our implementation leverages the code from NOCS, DualPoseNet, and SPD.
Our code is released under MIT License (see LICENSE file for details).
Contact: lin.jiehong@mail.scut.edu.cn