Shape-Constraint Recurrent Flow for 6D Object Pose Estimation (CVPR 2023)

Yang Hai, Rui Song, Jiaojiao Li, Yinlin Hu

Introduction

Most recent 6D object pose methods use 2D optical flow to refine their results. However, general optical flow methods typically do not consider the target’s 3D shape information during matching, making them less effective for 6D object pose estimation. In this work, we propose a shape-constraint recurrent matching framework for 6D object pose estimation.

**Figure 1. Different pose refinement paradigms.** (a) Most pose refinement methods rely on a recurrent architecture to estimate dense 2D flow between the rendered image I₁ and the real input image I₂, based on a correlation map that is constructed dynamically from the flow results of the previous iteration. After the flow network converges and the 2D flow is lifted to a 3D-to-2D correspondence field, they use PnP solvers to compute a new refined pose. This strategy, however, has a large matching space for every pixel when constructing correlation maps, and it optimizes a surrogate matching loss that does not directly reflect the final 6D pose estimation task. (b) By contrast, we propose optimizing the pose and flow simultaneously in an end-to-end recurrent framework with the guidance of the target’s 3D shape. We impose a shape constraint on the correlation map construction by forcing the construction to comply with the target’s 3D shape, which reduces the matching space significantly. Furthermore, we propose learning the object pose based on the current flow prediction, which, in turn, helps the flow prediction and yields an end-to-end system for object pose.

**Figure 3. Overview of our shape-constraint recurrent framework.** After building a 4D correlation volume between the rendered image and the input target image, we use a GRU to predict an intermediate flow, based on the predicted flow F_{k−1} and the GRU hidden state h_{k−1} from the previous iteration. We then use a pose regressor to predict the relative pose ∆P_k based on the intermediate flow, which is used to update the previous pose estimate P_{k−1}. Finally, we compute a pose-induced flow based on the displacement of the 2D reprojection between the initial pose and the currently estimated pose P_k. We use this pose-induced flow to index the correlation map in the following iterations, which reduces the matching space significantly. The dashed boxes show each flow and its corresponding warp result. Note how the intermediate flow does not preserve the shape of the target, but the pose-induced flow does.
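
The pose-induced flow described in the caption above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names, the pinhole camera without skew, and the toy intrinsics and poses are all assumptions made for illustration, and plain Python lists are used only to keep the example dependency-free. The idea is to project the same 3D model points under the initial pose and under the current pose, and take the 2D displacement between the two projections as the flow.

```python
from typing import List, Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]
Pose = Tuple[Mat3, Vec3]  # (rotation R, translation t), object -> camera


def project(points: Sequence[Vec3], pose: Pose, K: Mat3) -> List[Tuple[float, float]]:
    """Project 3D object points to pixels (assumed pinhole model, zero skew)."""
    R, t = pose
    pixels = []
    for X in points:
        # Camera-frame coordinates: Xc = R @ X + t
        xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        u = K[0][0] * xc[0] / xc[2] + K[0][2]
        v = K[1][1] * xc[1] / xc[2] + K[1][2]
        pixels.append((u, v))
    return pixels


def pose_induced_flow(points: Sequence[Vec3], pose_init: Pose, pose_k: Pose,
                      K: Mat3) -> List[Tuple[float, float]]:
    """2D displacement of each point's reprojection from pose_init to pose_k."""
    p0 = project(points, pose_init, K)
    pk = project(points, pose_k, K)
    return [(uk - u0, vk - v0) for (u0, v0), (uk, vk) in zip(p0, pk)]


# Toy values (hypothetical): focal length 500 px, principal point (320, 240).
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pose_init = (identity, [0.0, 0.0, 1.0])   # object 1 m in front of the camera
pose_k = (identity, [0.01, 0.0, 1.0])     # current estimate: shifted 1 cm along x
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]

flow = pose_induced_flow(points, pose_init, pose_k, K)
print(flow)
```

With these toy values, every point moves by about 5 pixels horizontally and not at all vertically. Because the flow is derived from a single rigid pose, all points move consistently with the object's 3D shape, which is the property the caption contrasts with the unconstrained intermediate flow.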

Installation

This code has been tested on an Ubuntu 18.04 server with CUDA 11.3.

Install the necessary packages with pip install -r requirements.txt
Install pytorch3d by building the pytorch3d project from source

Dataset Preparation

Download the YCB-V dataset from the BOP website and place it under the data/ycbv directory.
Download the image lists and place them under the data/ycbv/image_lists directory.
Download the PoseCNN initial poses and place them under the data/initial_poses/ycbv_posecnn directory.
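
As a quick sanity check before training, the directory layout from the steps above can be verified with a small script. This is a hypothetical helper, not part of the repository. It only checks that the three directories named above exist, not that their contents are complete.

```python
from pathlib import Path
from typing import List

# Directories named in the preparation steps, relative to the repository root.
EXPECTED_DIRS = [
    "data/ycbv",
    "data/ycbv/image_lists",
    "data/initial_poses/ycbv_posecnn",
]


def missing_dirs(root: str = ".") -> List[str]:
    """Return the expected directories that do not exist under `root`."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("Dataset layout looks complete.")
```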

Training

Download the RAFT pretrained model from mmflow and convert the checkpoint.

python tools/mmflow_ckpt_converter.py --model_url https://download.openmmlab.com/mmflow/raft/raft_8x2_100k_flyingthings3d_400x720.pth

Replace the _base_ in configs/refine_models/scflow.py with the desired training setting from configs/refine_datasets.

Use train.py.

python train.py --config configs/refine_models/scflow.py

Testing

Evaluate the performance.

python test.py --config configs/refine_models/scflow.py --checkpoint *** --eval

Save the results.

python test.py --config configs/refine_models/scflow.py --checkpoint *** --format-only --save-dir ***

Pretrained Models

Pretrained models for the different training settings are available here.

Citation

If you find our project helpful, please cite:

@inproceedings{yang2023scflow,
    title={Shape-Constraint Recurrent Flow for 6D Object Pose Estimation},
    author={Hai, Yang and Song, Rui and Li, Jiaojiao and Hu, Yinlin},
    booktitle={Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
    year={2023}}

Acknowledgement

We build this project based on mmflow, GDR-Net, and PFA. We thank the authors for their great code repositories.
