jfzhang95 / PoseAug

[CVPR 2021] PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation, (Oral, Best Paper Award Finalist)
MIT License

3DPW cross-dataset comparison #23

Closed yangchris11 closed 3 years ago

yangchris11 commented 3 years ago

From #9, I was wondering if you can provide the 3dpw file or perhaps the preprocessing code?

I followed the SPIN and your 3DHP preprocessing scripts, but that resulted in a PA-MPJPE of around 120 when using your models on a test npz I put together myself (16 joints). Thanks!

yangchris11 commented 3 years ago

I guess the main difference is probably how you transform the COCO 18 joints (for the 2D annotations) and SMPL 24 joints (for the 3D annotations) from 3DPW to the 16 joints used in your work.

So I would like to get the preprocessing procedure from you so I can make a fair comparison! Thanks!

Garfield-kh commented 3 years ago

Hi, thank you for your interest! When I processed the 3DPW data, I noticed that some 2D detections contain invalid values (very large compared with the image width/height). Since such data leads to invalid pose predictions and introduces unrelated noise, I removed those samples in the comparison with/without PoseAug. Another point concerns 2D pose normalization for portrait images (height > width), such as in issue #22: using the image width in the normalization may produce 2D pose values > 1, so we pad the image to a square so that all 2D pose values fall within -1~+1. This amounts to simply replacing the width with the height whenever height > width. Could you try these two changes?
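The pad-to-square fix described above might look like this (a minimal sketch in the style of a VideoPose3D-like `normalize_screen_coordinates`; the exact function in the repo may differ):

```python
import numpy as np

def normalize_screen_coordinates(X, w, h):
    # Sketch of the pad-to-square fix: for a portrait image (h > w),
    # treat the width as equal to the height so all joints map into [-1, 1].
    assert X.shape[-1] == 2
    if h > w:
        w = h
    # x: [0, w] -> [-1, 1]; y is scaled by the same factor and centered.
    return X / w * 2 - np.array([1, h / w])
```

With this change, a joint at the bottom-right corner of a 100x200 portrait image normalizes to (0, 1) instead of spilling past +1.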

Garfield-kh commented 3 years ago

Hi, I have uploaded a 16-joint GT version; you may use this one for comparison. Hope this helps. Thank you~

yangchris11 commented 3 years ago

Thank you for your answers! The labels are, as you suggested, somewhat corrupted!

(By the way, the Google Drive link you provided is not public yet.)

Garfield-kh commented 3 years ago

Hi, I have updated the permission. Can you give it a try?

yangchris11 commented 3 years ago

Thanks, it works!

yangchris11 commented 3 years ago

I put it into a single npz file, just like what you did for 3DHP here.

People who are also interested in testing the performance on 3DPW can add these lines to their data preparation:

# Assumes numpy (np), torch.utils.data.DataLoader, and the repo's
# PoseBuffer are already imported in the data-preparation script.
print('=====> Generating 3DPW dataloader...')
pw3d_npz = np.load('data_extra/test_set/test_3dpw.npz')
pw3d_loader = DataLoader(PoseBuffer([pw3d_npz['pose3d']], [pw3d_npz['pose2d']]),
                         batch_size=args.batch_size,
                         shuffle=False,
                         num_workers=args.num_workers,
                         pin_memory=True)

Again, thank you for your replies!

jacksoncsy commented 2 years ago

> Another point is that when applying 2D pose normalization, for image (height > width) such as issue #22. Using the image width in 2D pose normalization may lead to 2D pose value > 1, so we padding the image to be a square such that all the 2D pose values are within -1~+1. The way to achieve this is just replacing the width by height if height > width.

Hello Kehong,

Regarding this problem, maybe another way is to move the 2D joints to the center of a pre-defined square image frame and, accordingly, move the 3D joints to the origin. At inference time we would also move the joints to the center before feeding them to the network. Do you think this could alleviate the problem in cross-dataset training/testing?
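A minimal sketch of this centering idea (all names hypothetical, not taken from the PoseAug code):

```python
import numpy as np

def center_joints(pose_2d, pose_3d, frame_size, root_idx=0):
    # Hypothetical sketch: shift the 2D root joint to the center of a
    # pre-defined square frame, and shift the 3D root joint to the origin.
    pose_2d = pose_2d - pose_2d[root_idx] + frame_size / 2.0
    pose_3d = pose_3d - pose_3d[root_idx]
    return pose_2d, pose_3d
```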

Thanks, Shiyang

Garfield-kh commented 2 years ago

Hi Jacksoncsy,

Root-related operations on the 2D pose may distort the perspective information, which will affect performance, as stated in this paper, if I understand correctly.

Regards, Kehong

jacksoncsy commented 2 years ago

Thank you for the reference! Yes, I agree that it could be. However, in that paper they only mention root-centering the 2D pose; I agree that doing so without modifying the corresponding 3D pose will definitely cause great confusion. On the other hand, I cannot think of a better way to correct the 3D pose in such a case...

jacksoncsy commented 2 years ago

> Hi, I have uploaded a 16 joints gt version, you may use this one for comparison.

Another question: the 3DPW test set you shared includes around 27k examples, whereas the 3DPW test set has over 35k images. May I know what kind of filtering you did during preprocessing? It seems that fewer than 400 poses fall outside the image border.

I really appreciate your help, thanks in advance!

Garfield-kh commented 2 years ago

> > Hi, I have uploaded a 16 joints gt version, you may use this one for comparison.
>
> Another question is that, the 3DPW testset you shared includes around 27k examples, whereas 3DPW testset has over 35k images, may I know what kind of filtering did you do when preprocessing? It seems that only less than 400 poses are out of image border.
>
> I really appreciate your help, thanks in advance!

As these are 2D-3D GT pairs, during preprocessing I found that some frames have no camera parameters (intrinsic, extrinsic) available. I skipped them, as the 3D-to-2D projection cannot be done without camera parameters.
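For reference, the 3D-to-2D projection that requires those camera parameters can be sketched as a generic pinhole projection (variable names are illustrative, not the repo's API):

```python
import numpy as np

def project_to_2d(joints_3d, R, t, K):
    # Pinhole projection: world -> camera via extrinsics (R, t),
    # then camera -> pixel coordinates via intrinsics K.
    cam = joints_3d @ R.T + t        # (N, 3) camera-space joints
    uv = cam @ K.T                   # (N, 3) homogeneous image coordinates
    return uv[:, :2] / uv[:, 2:3]    # perspective divide -> (N, 2) pixels
```

Without R, t, and K for a frame, there is no way to pair its 3D GT with 2D pixel coordinates, hence the skipped frames.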

jacksoncsy commented 2 years ago

Oh, I see, this is very useful information. In fact, I did not realise this! So you actually used GT 2D points when evaluating 3DPW, rather than predicted 2D joints? Thanks a lot!

Garfield-kh commented 2 years ago

> Oh, I see, this is very useful information. In fact, I did not realise about this! So you actually used GT 2D points when evaluating 3DPW, rather than predicted 2D joints? Thanks a lot!

Hi, both were tried, but the released one is under the GT setting, since the detected one is somewhat corrupted, as mentioned here.

jacksoncsy commented 2 years ago

Hello, I used the GT 2D joints you provided, but somehow I got results different from the paper. They are all much better than reported:

| Methods | PA-MPJPE |
| -- | -- |
| STGCN | 69.1 |
| STGCN+PoseAug | 64.73 |
| SemGCN | 96.25 |
| SemGCN+PoseAug | 83.75 |
| VPose | 67.96 |
| VPose+PoseAug | 60.66 |
| Baseline | 67.22 |
| Baseline+PoseAug | 59.49 |

Could you please help me check this? Many thanks!

Garfield-kh commented 2 years ago

> Hello, I used the GT 2D joints you provided, somehow I have got different results from the paper. They are all much better than reported: […] Could you please help me check this? Many thanks!

Yes, that's right. The numbers reported in the paper are under the detected-2D setting, as mentioned in the link.

zlou commented 2 years ago

Is it possible to provide the processed detected 2D poses so we can reproduce the results mentioned in the paper?