ru1ven / KeypointFusion

[AAAI2024] Keypoint Fusion for RGB-D Based 3D Hand Pose Estimation
MIT License

How can I verify the results on my own RGB-D data? Can you provide a demo? #2

Open chenyu20230702 opened 6 months ago

chenyu20230702 commented 6 months ago

I want to run inference on my own RGB+depth data using your model. Can you provide a demo process?

ru1ven commented 6 months ago

Hello! Inference on raw RGB-D data is similar to inference on HO3D. Since HO3D does not release ground truth for its evaluation set, we infer the hand root joints with a network in advance, as shown in loader.py. If you use in-the-wild RGB-D data (not hand-only), you also need to predict a bounding box to crop hand-only images. We provide validation on HO3D as well as visualization of the hand joints in train.py.
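For reference, cropping a hand-only RGB-D patch from a full frame could look roughly like this (a sketch only; the bounding box must come from a hand detector of your choice, which is not part of this repo):

```python
def crop_hand_rgbd(rgb, depth, bbox, pad=20):
    """Crop a hand-only patch from a full-frame RGB-D pair given a detected bbox.

    bbox = (x0, y0, x1, y1) in pixel coordinates, produced by any
    off-the-shelf hand detector.
    """
    h, w = depth.shape[:2]
    x0, y0, x1, y1 = bbox
    # Pad the box a little so the whole hand stays inside the crop.
    x0, y0 = max(0, x0 - pad), max(0, y0 - pad)
    x1, y1 = min(w, x1 + pad), min(h, y1 + pad)
    return rgb[y0:y1, x0:x1], depth[y0:y1, x0:x1]
```

The cropped patch is then normalized into a fixed-size cube around the hand root joint before being fed to the network, as done in loader.py.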

chenyu19880302 commented 6 months ago

Thanks for answering my question! I have carefully read your code, but I still have a bit of confusion. For example, in `imgD = self.normalize_img(depth_crop.max(), depth_crop, center_xyz, self.cube_size)`, the `center_xyz` value is obtained from the ground truth. If I use in-the-wild RGB-D data, how can I get it? Another question: how can I predict the root joints?

ru1ven commented 5 months ago

For validation on HO3D and NYU, we use a ResNet-18 to regress the root joints in an offline manner, and use them to crop the cubes around the hand in the RGB-D images, following previous depth-based methods. This is also applicable to in-the-wild scenarios.

In addition, we will update the repo with a simple demo for in-the-wild inference. Feel free to follow our code!
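A minimal sketch of such an offline root regressor (an illustration of the idea rather than our exact model, assuming torchvision's ResNet-18 adapted to a single-channel depth crop):

```python
import torch.nn as nn
from torchvision.models import resnet18

class RootRegressor(nn.Module):
    """ResNet-18 backbone that maps a 1-channel depth crop to the hand
    root joint (x, y, z). Trained offline, then used to provide the
    crop center for the RGB-D cube during inference."""

    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        # Accept single-channel depth input instead of 3-channel RGB.
        backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                   padding=3, bias=False)
        # Regress three values: the (normalized) root joint coordinates.
        backbone.fc = nn.Linear(backbone.fc.in_features, 3)
        self.backbone = backbone

    def forward(self, depth_crop):        # (B, 1, H, W)
        return self.backbone(depth_crop)  # (B, 3)
```

The predicted root then plays the role of `center_xyz` in the `normalize_img` call mentioned above, replacing the ground-truth center that is only available for the benchmark datasets.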

git-xuefu commented 4 months ago

> For validation on HO3D and NYU, we use a ResNet-18 to regress the root joints in an offline manner, and use them to crop the cubes around the hand in the RGB-D images, following previous depth-based methods. This is also applicable to in-the-wild scenarios.
>
> In addition, we will update the repo with a simple demo for in-the-wild inference. Feel free to follow our code!

I ran the demo code and found that the inference time is a bit long. How can I shorten the inference time to reach the millisecond level reported in the paper?

ru1ven commented 4 months ago

> I ran the demo code and found that the inference time is a bit long. How can I shorten the inference time to reach the millisecond level reported in the paper?

My performance is 40-50 FPS on a 4090; how about yours?

git-xuefu commented 4 months ago

> > I ran the demo code and found that the inference time is a bit long. How can I shorten the inference time to reach the millisecond level reported in the paper?
>
> My performance is 40-50 FPS on a 4090; how about yours?

I just ran demo_RGBD and the inference time was about 1.6 seconds on a 4060.

ru1ven commented 4 months ago

> I just ran demo_RGBD and the inference time was about 1.6 seconds on a 4060.

That is strange. Please check the code; the inference process should be able to run in real time on a desktop GPU.

git-xuefu commented 4 months ago

> > I just ran demo_RGBD and the inference time was about 1.6 seconds on a 4060.
>
> That is strange. Please check the code; the inference process should be able to run in real time on a desktop GPU.

Thank you for your reply, it helps me a lot. I am running on a 4060 laptop GPU, so maybe I need a better GPU. Is there anything else I can do to improve the inference speed, such as running inference in smaller stages?

ru1ven commented 4 months ago

> Thank you for your reply, it helps me a lot. I am running on a 4060 laptop GPU, so maybe I need a better GPU. Is there anything else I can do to improve the inference speed, such as running inference in smaller stages?

Even on a 4060, a latency of 1.2s is abnormal; I believe there are some unexpected errors. You can run a depth-only baseline by discarding the RGB-D fusion modules, and make sure that this baseline runs in real time.
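Also make sure you time only the forward pass, after a warm-up and with explicit CUDA synchronization; otherwise the reported time can include model loading, CUDA context creation, and asynchronous kernel launches. A rough benchmarking sketch (assuming `model` is the loaded network and `sample` one preprocessed input batch):

```python
import time
import torch

@torch.no_grad()
def benchmark(model, sample, device="cuda", warmup=10, iters=100):
    """Average per-frame latency of the forward pass only."""
    model = model.to(device).eval()
    sample = sample.to(device)

    # Warm-up: triggers cuDNN autotuning and lazy CUDA initialization.
    for _ in range(warmup):
        model(sample)
    torch.cuda.synchronize()

    start = time.time()
    for _ in range(iters):
        model(sample)
    torch.cuda.synchronize()  # wait for all queued kernels to finish

    latency = (time.time() - start) / iters
    print(f"{latency * 1000:.1f} ms/frame ({1.0 / latency:.1f} FPS)")
    return latency
```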