AlvinYH / Faster-VoxelPose

Official implementation of Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection
MIT License

A little performance drop when running this code, ask for HigherHRnet version #17

Open cucdengjunli opened 1 year ago

cucdengjunli commented 1 year ago

[image: Experiment result 1]

[image: Official result reported in the paper]

Dear authors: It was a pleasure to read your paper and code. When I tried to run this project to reproduce your results, my error increased by about 2 mm. Could you explain why?

Does your code correspond to this setting: [5 views; mask; weights]?

My conda environment is shown in the pictures below.

My GPU is an RTX 3090, with CUDA 11.3 and torch 1.11.0.

[images: Environment 1, Environment 2]

cucdengjunli commented 1 year ago

This is my validation result on Panoptic. My training settings are the same as /Faster-VoxelPose-main/configs/panoptic/jln64.yaml.

[image: Experiment result 1]

cucdengjunli commented 1 year ago

The backbone you provide is ResNet, but the backbone mentioned in the paper is HRNet; maybe this is the reason. Let me swap the backbone and see the result.

Your core/config.py shows that you use HigherHRNet.

cucdengjunli commented 1 year ago

Thanks for your excellent work! Could you please offer a HigherHRNet backbone version?

The code could look like this: https://github.com/HRNet/HigherHRNet-Human-Pose-Estimation/blob/master/lib/models/pose_higher_hrnet.py https://github.com/HRNet/HigherHRNet-Human-Pose-Estimation/blob/master/experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml
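Conceptually, adding a second backbone is a dispatch on the config. A minimal, hypothetical sketch of that idea (the registry, function names, and config keys below are invented for illustration; the repo's actual wiring lives under lib/models and may differ):

```python
# Hypothetical backbone registry, sketched to show how a HigherHRNet option
# could sit next to the existing Pose ResNet one. All names are illustrative.
BACKBONE_FACTORIES = {}

def register_backbone(name):
    """Decorator that records a backbone constructor under a config name."""
    def decorator(fn):
        BACKBONE_FACTORIES[name] = fn
        return fn
    return decorator

@register_backbone("pose_resnet")
def build_pose_resnet(cfg):
    # in real code this would build and return the Pose ResNet module
    return f"pose_resnet_{cfg['NUM_LAYERS']}"

@register_backbone("higher_hrnet")
def build_higher_hrnet(cfg):
    # in real code this would construct HigherHRNet from the linked repo
    return f"higher_hrnet_w{cfg['WIDTH']}"

def get_backbone(cfg):
    """Look up and build the backbone named in the config."""
    name = cfg["BACKBONE"]
    if name not in BACKBONE_FACTORIES:
        raise KeyError(f"unknown backbone: {name!r}")
    return BACKBONE_FACTORIES[name](cfg)

print(get_backbone({"BACKBONE": "higher_hrnet", "WIDTH": 32}))  # higher_hrnet_w32
```

With this pattern, switching backbones is a one-line config change rather than an edit to the model code.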

gpastal24 commented 1 year ago

Hi, how did you manage to train the model on an RTX 30-series GPU? Did you make any changes to the code?

cucdengjunli commented 1 year ago

> Hi, how did you manage to train the model on an RTX 30-series GPU? Did you make any changes to the code?

https://github.com/microsoft/voxelpose-pytorch/issues/19

I tried this and succeeded.

gpastal24 commented 1 year ago

> Hi, how did you manage to train the model on an RTX 30-series GPU? Did you make any changes to the code?
>
> microsoft/voxelpose-pytorch#19
>
> I tried this and succeeded.

I did something similar, eventually. I returned the total loss at each iteration, and I got an 18.6 mm 3D error.

```python
loss = loss_dict["total"]
loss_2d = loss_dict["2d_heatmaps"]
loss_1d = loss_dict["1d_heatmaps"]
loss_bbox = loss_dict["bbox"]
loss_joint = loss_dict["joint"]

losses.update(loss.item())
losses_2d.update(loss_2d.item())
losses_1d.update(loss_1d.item())
losses_bbox.update(loss_bbox.item())
losses_joint.update(loss_joint.item())

optimizer.zero_grad()
loss.backward()
optimizer.step()
```
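For context, a runnable sketch of the loop that fragment sits in. The `AverageMeter` here is a hypothetical stand-in mirroring the common pattern in VoxelPose-style training scripts, and plain floats replace the torch tensors returned by the real model (so the `.item()` calls and optimizer steps are noted in comments rather than executed):

```python
class AverageMeter:
    """Tracks a running average, as commonly used in these training scripts."""
    def __init__(self):
        self.sum = 0.0
        self.count = 0

    def update(self, val, n=1):
        self.sum += val * n
        self.count += n

    @property
    def avg(self):
        return self.sum / max(self.count, 1)

# Stub batches standing in for the loss_dict returned by the model each
# iteration; real values would be torch tensors needing .item() for update().
batches = [
    {"total": 5.0, "2d_heatmaps": 2.0, "1d_heatmaps": 1.0, "bbox": 1.0, "joint": 1.0},
    {"total": 3.0, "2d_heatmaps": 1.0, "1d_heatmaps": 0.5, "bbox": 0.5, "joint": 1.0},
]

losses = AverageMeter()
for loss_dict in batches:
    loss = loss_dict["total"]  # back-propagate the *total* loss every batch
    losses.update(loss)        # real loop: losses.update(loss.item())
    # optimizer.zero_grad(); loss.backward(); optimizer.step()  # torch steps omitted

print(losses.avg)  # running mean of the total loss -> 4.0
```

The point of the change is only that the combined loss is back-propagated once per batch iteration, rather than whatever the original script did.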

Screenshot from 2023-02-07 10-06-27

Mosh-Wang commented 1 year ago

> Hi, how did you manage to train the model on an RTX 30-series GPU? Did you make any changes to the code?
>
> microsoft/voxelpose-pytorch#19 I tried this and succeeded.
>
> I did something similar, eventually. I returned the total loss at each iteration, and I got an 18.6 mm 3D error. [...]

Hello, may I ask how exactly you changed the loss part? How is loss_dict defined?

gpastal24 commented 1 year ago

@Mosh-Wang I just changed the code to backprop the total loss at every batch iteration. The loss dict is returned by the FVP model. I didn't do anything fancy.

https://github.com/AlvinYH/Faster-VoxelPose/blob/4daaedad466b9c95b1e9b35cfabd496b60e6013a/lib/models/voxelpose.py#L74-L80

Mosh-Wang commented 1 year ago

> @Mosh-Wang I just changed the code to backprop the total loss at every batch iteration. The loss dict is returned by the FVP model. I didn't do anything fancy.
>
> https://github.com/AlvinYH/Faster-VoxelPose/blob/4daaedad466b9c95b1e9b35cfabd496b60e6013a/lib/models/voxelpose.py#L74-L80

Thank you very much for your reply. May I ask one more question? When training on Panoptic, I use TRAIN_HEATMAP_SRC: 'image' and TEST_HEATMAP_SRC: 'image' from the original config, and I get the following error. Do you also use this setting, or have you changed it? What do you think the reason is? ref

gpastal24 commented 1 year ago

@Mosh-Wang https://github.com/AlvinYH/Faster-VoxelPose/blob/4daaedad466b9c95b1e9b35cfabd496b60e6013a/lib/dataset/JointsDataset.py#L71

Change this to input_heatmaps, or

https://github.com/AlvinYH/Faster-VoxelPose/blob/4daaedad466b9c95b1e9b35cfabd496b60e6013a/lib/dataset/JointsDataset.py#L168

to input_heatmap. They are not used anyway if you are using the image for training and testing. Just do a quick check that this resolves the problem for both training and validating, before waiting through a whole epoch.
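The underlying idea is a small dispatch on the configured heatmap source: when the source is 'image', the 2D backbone produces the heatmaps and any precomputed entry is unused, but the dataset key it reads must still exist. A hedged sketch under assumed names (the function, dict keys, and the 'pred' value are illustrative, not the repo's exact API):

```python
def get_heatmaps(db_rec, backbone, image, heatmap_src):
    """Pick the 2D heatmap source the way the dataset/model is assumed to:
    either run the backbone on the image, or load precomputed heatmaps."""
    if heatmap_src == "image":
        # Heatmaps come from the 2D backbone; precomputed entries in db_rec
        # are ignored, which is why renaming the unused key is harmless.
        return backbone(image)
    elif heatmap_src == "pred":
        # Precomputed heatmaps loaded by the dataset (key name is an assumption).
        return db_rec["input_heatmaps"]
    else:
        raise ValueError(f"unknown heatmap source: {heatmap_src!r}")

# Toy usage with stand-ins for the backbone and image.
fake_backbone = lambda img: ["hm_from_" + img]
print(get_heatmaps({}, fake_backbone, "img0", "image"))  # ['hm_from_img0']
```

This is why the rename is safe in the 'image' setting: that branch never touches the precomputed entry.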

gpastal24 commented 1 year ago

@Mosh-Wang Regarding the other question: I don't know if it matters that much. If you try both approaches, would you be kind enough to let us know whether training the 2D network as well increases the performance of the method?

zaie commented 1 year ago

Maybe the omitted loss_off causes the performance drop. https://github.com/AlvinYH/Faster-VoxelPose/issues/26

cucdengjunli commented 1 year ago

I reproduced the HigherHRNet version of the backbone.

gpastal24 commented 1 year ago

@cucdengjunli

> I reproduced the HigherHRNet version of the backbone.

Did you get the same results as the paper?

AlvinYH commented 1 year ago

Hi, @cucdengjunli. Thanks for your interest in our work. We've modified the code, and you can pull the recent release. Yes, we made several changes to the model architecture (removed the offset branch and reduced the feature dimension in the weight_net), so the experimental results differ slightly from those in the original paper. Specifically, on the Panoptic dataset, the MPJPE increases a little (+0.15 mm), while the new model improves AP25 by 1.44. You can download the pre-trained checkpoints. We'll revise our paper to specify these alterations. Also, thanks for pointing out our mistake: we did use Pose ResNet for training on the Panoptic dataset instead of HigherHRNet. We'll fix this typo in the final version. Using HigherHRNet is expected to further reduce the errors.

cucdengjunli commented 1 year ago

> Did you get the same results as the paper?

Yes, mpjpe@500mm: 17.966.

cucdengjunli commented 1 year ago

> Hi, @cucdengjunli. Thanks for your interest in our work. We've modified the code and you can pull the recent release. [...]

thank you!

CodeCrusader66 commented 1 year ago

> Hi, @cucdengjunli. Thanks for your interest in our work. We've modified the code and you can pull the recent release. [...] And using HigherHRNet is expected to further reduce the errors.

I have also reproduced the HigherHRNet version of the code 😄; the result is the same as reported in your paper. May I send you a merge request?