YangHai-1218 / SCFlow

Shape-Constraint Recurrent Flow for 6D Object Pose Estimation (CVPR 2023)
GNU General Public License v3.0

Yang Hai, Rui Song, Jiaojiao Li, Yinlin Hu

Paper | Poster | Video

Introduction

Most recent 6D object pose methods use 2D optical flow to refine their results. However, general optical flow methods typically do not consider the target's 3D shape information during matching, making them less effective in 6D object pose estimation. In this work, we propose a shape-constraint recurrent matching framework for 6D object pose estimation.

Figure 1. Different pose refinement paradigms. (a) Most pose refinement methods rely on a recurrent architecture to estimate dense 2D flow between the rendered image I_1 and the real input image I_2, based on a correlation map that is constructed dynamically from the flow results of the previous iteration. After the flow network converges and the 2D flow is lifted to a 3D-to-2D correspondence field, they use PnP solvers to compute a new refined pose. This strategy, however, has a large matching space for every pixel when constructing correlation maps, and it optimizes a surrogate matching loss that does not directly reflect the final 6D pose estimation task. (b) By contrast, we propose optimizing the pose and flow simultaneously in an end-to-end recurrent framework with the guidance of the target's 3D shape. We impose a shape constraint on the correlation map construction by forcing the construction to comply with the target's 3D shape, which reduces the matching space significantly. Furthermore, we propose learning the object pose based on the current flow prediction, which, in turn, helps the flow prediction and yields an end-to-end system for object pose.

Figure 3. Overview of our shape-constraint recurrent framework. After building a 4D correlation volume between the rendered image and the input target image, we use a GRU to predict an intermediate flow, based on the predicted flow F_{k-1} and the hidden state h_{k-1} of the GRU from the previous iteration. We then use a pose regressor to predict the relative pose ∆P_k based on the intermediate flow, which is used to update the previous pose estimate P_{k-1}. Finally, we compute a pose-induced flow based on the displacement of the 2D reprojection between the initial pose and the currently estimated pose P_k. We use this pose-induced flow to index the correlation map for the following iterations, which reduces the matching space significantly. Here we show the flow and its corresponding warp results in the dashed boxes. Note how the intermediate flow does not preserve the shape of the target, but the pose-induced flow does.
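
The pose update and the pose-induced flow described in the Figure 3 caption can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the helper names (`compose_pose`, `project`, `pose_induced_flow`), the pinhole intrinsics `K`, and the convention that the relative pose is applied on the left of the previous pose are all hypothetical choices made for this example, and the paper's actual pose parameterization may differ.

```python
# Minimal sketch (assumed conventions, not the SCFlow code base).
# A pose is a tuple (R, t): R is a 3x3 rotation as nested lists, t a 3-vector.

def mat_vec(R, x):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]

def mat_mul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def compose_pose(delta, pose):
    """Apply a relative pose on top of a pose: x -> dR (R x + t) + dt.
    Left-multiplication is an assumption made for this sketch."""
    dR, dt = delta
    R, t = pose
    R_new = mat_mul(dR, R)
    t_new = [a + b for a, b in zip(mat_vec(dR, t), dt)]
    return R_new, t_new

def project(K, pose, X):
    """Project a 3D model point X into the image with a pinhole camera K."""
    R, t = pose
    xc = [a + b for a, b in zip(mat_vec(R, X), t)]  # point in camera frame
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v

def pose_induced_flow(K, pose_init, pose_cur, points_3d):
    """Per-point 2D displacement between the reprojections under the
    initial pose and the current pose."""
    flow = []
    for X in points_3d:
        u0, v0 = project(K, pose_init, X)
        u1, v1 = project(K, pose_cur, X)
        flow.append((u1 - u0, v1 - v0))
    return flow

# Toy example: identity rotation, object 2 units in front of the camera.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = [[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]]
pose_init = (I3, [0.0, 0.0, 2.0])
delta = (I3, [0.5, 0.0, 0.0])          # hypothetical predicted relative pose
pose_cur = compose_pose(delta, pose_init)
points = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
flow = pose_induced_flow(K, pose_init, pose_cur, points)
print(flow)
```

Because every flow vector here is derived from a single rigid pose applied to the same 3D points, the resulting field is consistent with the object's shape by construction, which is the property the caption contrasts with the unconstrained intermediate flow.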

Installation

This code has been tested on an Ubuntu 18.04 server with CUDA 11.3.

Dataset Preparation

Testing

Evaluate the performance.

python test.py --config configs/refine_models/scflow.py --checkpoint *** --eval

Save the results.

python test.py --config configs/refine_models/scflow.py --checkpoint *** --format-only --save-dir ***
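
When running the two commands above for several checkpoints, it can help to build the argument list programmatically. The following is only a sketch: the function name `build_test_command` and the example paths are hypothetical, and the only flags used are the ones shown in the commands above.

```python
import shlex

def build_test_command(checkpoint, save_dir=None,
                       config="configs/refine_models/scflow.py"):
    """Return the test.py argument list for evaluation (no save_dir)
    or for saving results (save_dir given)."""
    cmd = ["python", "test.py", "--config", config, "--checkpoint", checkpoint]
    if save_dir is None:
        cmd.append("--eval")
    else:
        cmd += ["--format-only", "--save-dir", save_dir]
    return cmd

# Hypothetical paths, for illustration only.
eval_cmd = build_test_command("checkpoints/example.pth")
save_cmd = build_test_command("checkpoints/example.pth", save_dir="results/example")
print(shlex.join(eval_cmd))
print(shlex.join(save_cmd))
```

The returned list can be passed directly to `subprocess.run` from the repository root.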

Pretrained Models

Pretrained models for the different training settings are available here.

Citation

If you find our project helpful, please cite:

@inproceedings{yang2023scflow,
    title={Shape-Constraint Recurrent Flow for 6D Object Pose Estimation},
    author={Hai, Yang and Song, Rui and Li, Jiaojiao and Hu, Yinlin},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2023}
}

Acknowledgement

We build this project based on mmflow, GDR-Net, and PFA. We thank the authors for their great code repositories.