705062791 / PGBIG

CVPR 2022

Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction

Official implementation of Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction (CVPR 2022 paper)

[PDF] [Supp] [Demo]

Authors

  1. Tiezheng Ma, School of Computer Science and Engineering, South China University of Technology, China, mtz705062791@gmail.com
  2. Yongwei Nie, School of Computer Science and Engineering, South China University of Technology, China, nieyongwei@scut.edu.cn
  3. Chengjiang Long, Meta Reality Labs, USA, clong1@fb.com
  4. Qing Zhang, School of Computer Science and Engineering, Sun Yat-sen University, China, zhangqing.whu.cs@gmail.com
  5. Guiqing Li, School of Computer Science and Engineering, South China University of Technology, China, ligq@scut.edu.cn

Abstract

    This paper presents a high-quality human motion prediction method that accurately predicts future human poses given observed ones. Our method is mainly based on the observation that a good initial guess of the future pose sequence, such as the mean of future poses, is very helpful to improve the forecasting accuracy. This motivates us to design a novel two-stage prediction strategy, including an init-prediction network that just computes a good initial guess and a formal-prediction network that takes both the historical and initial poses to predict the target pose sequence. We extend this idea further and design a multi-stage prediction framework with each stage predicting initial guess for the next stage, which rewards us with significant performance gain. To fulfill the prediction task at each stage, we propose a network comprising Spatial Dense Graph Convolutional Networks (S-DGCN) and Temporal Dense Graph Convolutional Networks (T-DGCN). Sequentially executing the two networks can extract spatiotemporal features over the global receptive field of the whole pose sequence effectively. All the above design choices cooperating together make our method outperform previous approaches by a large margin (6%-7% on Human3.6M, 5%-10% on CMU-MoCap, 13%-16% on 3DPW).
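To make the initial-guess observation concrete, here is a minimal NumPy sketch; it is not the repository's code. It pads an observed history with a constant guess and compares two guesses: the last observed pose and the mean of the future poses. The synthetic random-walk sequence and the helper names `pad_with_guess` and `guess_error` are assumptions made for illustration. The ground-truth future mean is used directly here, although it is unknown at test time; estimating something like it is the job the abstract assigns to the init-prediction network.

```python
import numpy as np

def pad_with_guess(history, guess, output_n):
    """Append a constant initial guess, repeated output_n times, to the history."""
    pad = np.repeat(guess[None, :], output_n, axis=0)
    return np.concatenate([history, pad], axis=0)

def guess_error(padded, future, input_n):
    """Mean squared L2 distance between the padded part and the true future."""
    return float(np.mean(np.sum((padded[input_n:] - future) ** 2, axis=-1)))

# Sizes borrowed from the Human3.6M training command (10 in, 25 out, 66 features).
input_n, output_n, in_features = 10, 25, 66

# Hypothetical data: a random walk standing in for a real pose sequence.
rng = np.random.default_rng(0)
steps = rng.normal(scale=0.1, size=(input_n + output_n, in_features))
seq = np.cumsum(steps, axis=0)
history, future = seq[:input_n], seq[input_n:]

# Guess 1: repeat the last observed pose.
padded_last = pad_with_guess(history, history[-1], output_n)
# Guess 2: repeat the mean of the future poses (ground truth, for illustration only).
padded_mean = pad_with_guess(history, future.mean(axis=0), output_n)

err_last = guess_error(padded_last, future, input_n)
err_mean = guess_error(padded_mean, future, input_n)
print(f"last-pose guess error: {err_last:.4f}")
print(f"future-mean guess error: {err_mean:.4f}")
```

Among all constant guesses, the mean minimizes the summed squared distance to the future poses, so `err_mean` cannot exceed `err_last` in this setup.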

Overview

PGBIG
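The abstract's claim that running S-DGCN and then T-DGCN covers the whole pose sequence can be illustrated with a small NumPy sketch. This is a simplification under stated assumptions, not the repository's network: "dense" is taken to mean a fully connected adjacency matrix, nonlinearities and residual connections are omitted, 22 joints with 3 coordinates are assumed from `--in_features 66`, and the function names are hypothetical.

```python
import numpy as np

def spatial_gc(x, adj_s, w):
    """Mix features across joints; adj_s is shared by every frame."""
    # x: (joints, frames, channels), adj_s: (joints, joints), w: (channels, channels)
    return np.einsum("ij,jtc->itc", adj_s, x) @ w

def temporal_gc(x, adj_t, w):
    """Mix features across frames; adj_t is shared by every joint."""
    # adj_t: (frames, frames)
    return np.einsum("ts,jsc->jtc", adj_t, x) @ w

def stage(x, params):
    """One spatial layer followed by one temporal layer."""
    adj_s, w_s, adj_t, w_t = params
    return temporal_gc(spatial_gc(x, adj_s, w_s), adj_t, w_t)

rng = np.random.default_rng(0)
joints, frames, channels = 22, 35, 3  # assumed layout: 66 features = 22 joints x 3

# Random positive weights stand in for learned parameters.
params = (
    rng.uniform(0.1, 1.0, (joints, joints)),
    rng.uniform(0.1, 1.0, (channels, channels)),
    rng.uniform(0.1, 1.0, (frames, frames)),
    rng.uniform(0.1, 1.0, (channels, channels)),
)

# Feed a single impulse (one joint, one frame) and see where it spreads.
impulse = np.zeros((joints, frames, channels))
impulse[0, 0, 0] = 1.0
response = stage(impulse, params)
print(response.shape, bool((response > 0).all()))
```

Because both adjacency matrices are dense, a change at one joint in one frame reaches every joint in every frame after a single spatial-then-temporal pass, which is one way to read "global receptive field".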

Dependencies

DataSet

Human3.6M in exponential map format can be downloaded from here.

CMU mocap was obtained from the repo of the ConvSeq2Seq paper.

3DPW can be obtained from its official website.

Train

Human3.6M:

python main_h36m.py --data_dir [dataset path] --kernel_size 10 --dct_n 35 --input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 --test_batch_size 32 --in_features 66 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

CMU-MoCap:

python main_cmu_3d.py --data_dir [dataset path] --kernel_size 10 --dct_n 35 --input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 --test_batch_size 32 --in_features 75 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

3DPW (arguments only; the script name is not shown):

--data_dir [dataset path] --kernel_size 10 --dct_n 40 --input_n 10 --output_n 30 --skip_rate 1 --batch_size 32 --test_batch_size 32 --in_features 69 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

Note:

After training, the checkpoint is saved in ./checkpoint/.
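In the training commands, `--dct_n` always equals `--input_n` plus `--output_n` (35 = 10 + 25, 40 = 10 + 30), which suggests that the full padded sequence is represented by DCT coefficients along the time axis, as in LearnTrajDep. The following NumPy sketch shows what such a transform looks like. How the repository actually applies it is an assumption; the function name `dct_matrix` and the random input are illustrative only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    basis[0, :] = np.sqrt(1.0 / n)  # the constant row needs a smaller scale
    return basis

# Values taken from the Human3.6M training command.
input_n, output_n, dct_n, in_features = 10, 25, 35, 66
seq_len = input_n + output_n

# Hypothetical padded sequence (history plus initial guess), random for the demo.
rng = np.random.default_rng(0)
padded = rng.normal(size=(seq_len, in_features))

dct = dct_matrix(seq_len)[:dct_n]  # keep the first dct_n coefficients
coeffs = dct @ padded              # time axis -> frequency axis
recovered = dct.T @ coeffs         # inverse transform

print(coeffs.shape, np.allclose(recovered, padded))
```

With `dct_n` equal to the sequence length, the transform is orthonormal and lossless; a smaller `dct_n` would discard the highest temporal frequencies.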

Test

Append --is_eval to the training commands above.

The test result will be saved in ./checkpoint/.
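A flag that is simply appended to a command, like `--is_eval`, is typically a boolean switch. The sketch below shows how such a switch could be parsed with `argparse`. It is a hypothetical subset of the options, not the repository's parser: the defaults and the `store_true` behavior are assumptions.

```python
import argparse

def build_parser():
    """Hypothetical subset of the options used in the training commands."""
    parser = argparse.ArgumentParser(description="Illustrative option subset")
    parser.add_argument("--data_dir", type=str, default="./data")  # assumed default
    parser.add_argument("--input_n", type=int, default=10)
    parser.add_argument("--output_n", type=int, default=25)
    parser.add_argument("--dct_n", type=int, default=35)
    # Assumed to be a boolean switch: present means evaluate, absent means train.
    parser.add_argument("--is_eval", action="store_true")
    return parser

base = ["--input_n", "10", "--output_n", "25", "--dct_n", "35"]
train_args = build_parser().parse_args(base)
eval_args = build_parser().parse_args(base + ["--is_eval"])

mode = "evaluate" if eval_args.is_eval else "train"
print(train_args.is_eval, eval_args.is_eval, mode)
```

Keeping the remaining arguments identical between training and testing matters because the model shape (for example `--dct_n` and `--in_features`) must match the saved checkpoint.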

Citation

If you find our work helpful, please cite our paper.

Ma T, Nie Y, Long C, et al. Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 6437-6446.

Acknowledgments

Our code is based on HisRep and LearnTrajDep.

Licence

MIT