dexsuite / dex-retargeting

https://yzqin.github.io/anyteleop/
MIT License

How To do position retarget in real world? #20

Closed · BIT-wh closed this 4 weeks ago

BIT-wh commented 4 months ago

Hi, this is awesome work! I have a LEAP Hand in my lab. I want to use my laptop's camera to capture my hand's finger pose, then use dex-retargeting to retarget that pose to my LEAP Hand. However, my OS is Win11. Can I use it? How can I do it? Maybe my question is very basic, but I really need your help. Thank you so much :) (I am already able to do joint control of the dexterous hand, but I don't know how to deploy your code.) :)

yzqin commented 3 months ago

Your use case is a perfect fit for this code. Check out the vector_retargeting examples/tutorials and follow the step-by-step tutorial to learn how to retarget to a robot hand, like the LEAP Hand, using a laptop's camera. The examples and tutorial will guide you through the process. If you have any questions, feel free to reach out. Happy coding!

BIT-wh commented 3 months ago

Sorry to bother you again. When I install dex-retargeting, I get some errors about the cmeel_* packages, e.g. cmeel-boost. When I tried to resolve them, I realized that cmeel-boost doesn't seem to support Windows.

yzqin commented 3 months ago

First, the issue you hit is most likely related to the installation of the pin (Pinocchio) package. Based on the Pinocchio documentation, Windows should be supported, but the installation process can be a bit tricky.

According to user feedback from previous issues, using conda to install Pinocchio instead of pip seems to be a more reliable solution on Windows.

Here's a suggestion for installing the library on Windows:

  1. Create and activate a conda environment.
  2. Install Pinocchio using conda-forge:
    conda install pinocchio -c conda-forge
  3. Install the dex_retargeting library using pip:
    pip install dex_retargeting
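
Putting those steps together, the full sequence looks roughly like this (the environment name and Python version are arbitrary examples):

```bash
# Create and activate a fresh conda environment (name/version are examples)
conda create -n dex-retarget python=3.10
conda activate dex-retarget

# Install Pinocchio from conda-forge, then dex_retargeting from PyPI
conda install pinocchio -c conda-forge
pip install dex_retargeting
```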

For more detailed instructions and additional information, please refer to the official Pinocchio tutorials, which provide comprehensive guidance on getting started with Pinocchio.

BIT-wh commented 3 months ago

> Your use case is a perfect fit for this code. Check out the vector_retargeting examples/tutorials and follow the step-by-step tutorial to learn how to retarget to a robot hand, like the LEAP Hand, using a laptop's camera. The examples and tutorial will guide you through the process. If you have any questions, feel free to reach out. Happy coding!

Thanks for your reply. I have now set up an Ubuntu 20.04 VM and can successfully install dex_retargeting. I went through the vector_retargeting examples/tutorials, but they only show how to retarget the hand and render it in SAPIEN, and my computer is not powerful enough to run SAPIEN. I would like to retarget directly to a real LEAP Hand, but I didn't see how to do that.

yzqin commented 3 months ago

The SAPIEN example demonstrates how to control a simulated robot to visualize retargeting results, as most users don't have access to a real robot hand. Real-world robotics code heavily depends on hardware-specific drivers, which is beyond the scope of this repo. This repo focuses solely on the retargeting computation algorithm.

The retargeting algorithm outputs joint positions for each robot joint. To control your robot, use these joint positions as input for your robot's joint position controller, whether in simulation or the real world.
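
In code, the loop looks roughly like the sketch below. The `RetargetingConfig` usage follows the pattern from the examples; the config path, the keypoint function, and the `leap_hand_driver` object are placeholders for your own setup:

```python
from pathlib import Path
from dex_retargeting.retargeting_config import RetargetingConfig

# Point the config loader at the URDF directory, then build the optimizer
# (the paths here are examples; use the config matching your hand type)
RetargetingConfig.set_default_urdf_dir(str(Path("assets/robots/hands")))
retargeting = RetargetingConfig.load_from_file("teleop/leap_hand_right.yml").build()

while True:
    # Placeholder: compute the reference value from your keypoint detector;
    # the vector_retargeting example shows how to build it from MediaPipe output
    ref_value = compute_ref_value_from_keypoints()
    qpos = retargeting.retarget(ref_value)  # one position per robot joint
    # Placeholder driver call: feed qpos to your LEAP Hand's position controller
    leap_hand_driver.set_joint_positions(qpos)
```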

BIT-wh commented 3 months ago

Thank you for your reply; I have solved the problem above. The question I'm facing now: qpos is the vector of joint angles produced by the retargeting. How do the 16 joint angles here correspond to the joint nodes of MediaPipe? Or how do they correspond to the joints of the LEAP Hand? (I want to apply it on the actual LEAP Hand.) Overall, I would like to know the joint correspondence of qpos. Looking forward to your reply, thanks!

yzqin commented 3 months ago

This is a good question!

I have updated the README with the joint orders; you can use a similar method to handle the joint order mapping between the retargeting output and the LEAP Hand driver: README HERE
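
Concretely, the mapping can be done by joint name, the same way the SAPIEN example reorders joints before sending them to the simulator. A minimal sketch, assuming the `retargeting` and `qpos` objects from the earlier snippet; the driver joint order is a placeholder you need to fill in from your LEAP Hand driver:

```python
import numpy as np

# Joint names in the order used by the retargeting output (qpos)
retargeting_joint_names = retargeting.joint_names

# Placeholder: replace with the joint order your LEAP Hand driver expects
# (the names must match the URDF joint names listed in the README)
driver_joint_names = list(retargeting_joint_names)

# For each driver joint, find its index in the retargeting output
retargeting_to_driver = np.array(
    [retargeting_joint_names.index(name) for name in driver_joint_names]
)
driver_qpos = qpos[retargeting_to_driver]
```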

BIT-wh commented 3 months ago

Thank you so much! What an amazing project.

I have now successfully applied it to my LEAP Hand. May I ask, is the retargeting for the robotic arm ready to be open-sourced? Or are there any open-source projects you would recommend? Our group's robotic arm arrives in about two weeks, and I'd like to try mounting the LEAP Hand on the end of the arm as an end-effector. If you can help me, I will be thankful to the best of my ability.

You are such a helpful and selfless expert!

yzqin commented 3 months ago

Hi @BIT-wh

I would like to keep this repo simple and focused only on hands. For motion control of a robot arm for teleoperation, you can check out our other project here: https://github.com/Dingry/BunnyVisionPro

It is targeted at bimanual robots with two arms and two hands.

BIT-wh commented 3 months ago

Thank you for your generous reply! Bunny-VisionPro is also an amazing project!

I looked at your Bunny-VisionPro project, and it seems I would need an Apple Vision Pro. But I only have my laptop's camera and an RGB-D camera, so I don't seem to be able to deploy Bunny-VisionPro.

Do you have any other recommendations for a lightweight open-source project for robotic arm wrist retargeting?

UIOSN commented 2 months ago

> Thank you for your generous reply! Bunny-VisionPro is also an amazing project!
>
> I looked at your Bunny-VisionPro project, and it seems I would need an Apple Vision Pro. But I only have my laptop's camera and an RGB-D camera, so I don't seem to be able to deploy Bunny-VisionPro.
>
> Do you have any other recommendations for a lightweight open-source project for robotic arm wrist retargeting?

Yeah, I've run into the same problem. Have you found a solution for robotic arm wrist retargeting?

idombanker commented 2 months ago

> Hi @BIT-wh
>
> I would like to keep this repo simple and focused only on hands. For motion control of a robot arm for teleoperation, you can check out our other project here: https://github.com/Dingry/BunnyVisionPro
>
> It is targeted at bimanual robots with two arms and two hands.

Hi @yzqin,

In the paper associated with this repo, I read:

"Wrist Pose Detection from RGB-D. We use the pixel positions of the detected keypoints to retrieve the corresponding depth values from the depth image. Then, utilizing known intrinsic camera parameters, we compute the 3D positions of the keypoints in the camera frame. The alignment of the RGB and depth images is handled by the camera driver. With the 3D keypoint positions in both the local wrist frame and global camera frame, we can estimate the wrist pose using the Perspective-n-Point (PnP) algorithm."

However, I can't seem to find any module in the repo addressing this part mentioned in the paper.

Should I refer to other projects like Bunny-VisionPro, even though it uses the Apple Vision Pro instead of a regular depth camera like a RealSense? Or should I re-implement it based on the description in the paper?

Could you please confirm whether this repo contains any modules related to wrist pose detection, given your earlier comment on Jun 21 about keeping the repo focused only on hands?

Thank you.

yzqin commented 2 months ago

Hello everyone,

As discussed previously, I've decided to maintain this repository solely for hand retargeting purposes. Here are the reasons:

  1. Numerous excellent repositories already exist for inverse kinematics control of robot arms, such as pink and ikbt. These provide ample resources for robot arm control. However, similar resources for hand retargeting are scarce, which motivated the creation of this repository.

  2. Including arm-related components, like curobo, would introduce heavy dependencies, complicating installation and setup. It would also add GPU dependencies, whereas the current code is CPU-only, allowing it to run on less powerful machines.

  3. Specifically, NVIDIA's licensing of curobo would conflict with this repository's MIT license, limiting its flexibility and usage.

yzqin commented 2 months ago

@idombanker

This repo has the code for wrist orientation detection here. But the inverse kinematics code is not provided in this repo.

By the way, you can also use the kinematics code inside Bunny-VisionPro for motion control even without a Vision Pro.

idombanker commented 2 months ago

> @idombanker
>
> This repo has the code for wrist orientation detection here. But the inverse kinematics code is not provided in this repo.
>
> By the way, you can also use the kinematics code inside Bunny-VisionPro for motion control even without a Vision Pro.

Thank you, @yzqin.

The code you referred to for wrist orientation seems to use purely RGB input without any depth information, which differs from what is described in the paper:

"Wrist Pose Detection from RGB-D. We use the pixel positions of the detected keypoints to retrieve the corresponding depth values from the depth image. Then, utilizing known intrinsic camera parameters, we compute the 3D positions of the keypoints in the camera frame. The alignment of the RGB and depth images is handled by the camera driver. With the 3D keypoint positions in both the local wrist frame and global camera frame, we can estimate the wrist pose using the Perspective-n-Point (PnP) algorithm."

While MediaPipe can be leveraged to provide wrist orientation, we still need the position to fully teleoperate any arm and end-effector, or even just a floating hand in a simulator. The wrist position remains a bottleneck, as I haven't seen any visual method achieve operator-friendly teleoperation. I believe this still falls under "hand" and should be included even if the repo focuses on hands.

IK is not an issue, as there are mature solutions available. I noticed you mentioned using FrankMocap for wrist position in another discussion here. I'll look into FrankMocap, though its installation and integration might be challenging due to its older dependency versions. By the way, FrankMocap is no longer maintained. May I ask what SOTA alternatives you might suggest?

Thanks again.

Update:

I checked BunnyVisionPro and found the wrist pose is from ARKit HandTrackingProvider and is only available for iOS and VisionOS. So, using other common commercial depth cameras will not be feasible. I was thinking about using your method directly to process human demonstration video data and teleoperate a robot, but it seems there is still a significant gap to overcome.

UIOSN commented 2 months ago

> @idombanker
>
> This repo has the code for wrist orientation detection here. But the inverse kinematics code is not provided in this repo.
>
> By the way, you can also use the kinematics code inside Bunny-VisionPro for motion control even without a Vision Pro.

Thanks for your reply @yzqin. I've tried reusing the IK algorithm in BunnyVisionPro, but I failed to locate the code. It seems that the wrist pose isn't extracted and computed separately. If you could point me to the code, I'd be most grateful.

yzqin commented 2 months ago

> > @idombanker This repo has the code for wrist orientation detection here. But the inverse kinematics code is not provided in this repo. By the way, you can also use the kinematics code inside Bunny-VisionPro for motion control even without a Vision Pro.
>
> Thank you, @yzqin.
>
> The code you referred to for wrist orientation seems to use purely RGB input without any depth information, which differs from what is described in the paper:
>
> "Wrist Pose Detection from RGB-D. We use the pixel positions of the detected keypoints to retrieve the corresponding depth values from the depth image. Then, utilizing known intrinsic camera parameters, we compute the 3D positions of the keypoints in the camera frame. The alignment of the RGB and depth images is handled by the camera driver. With the 3D keypoint positions in both the local wrist frame and global camera frame, we can estimate the wrist pose using the Perspective-n-Point (PnP) algorithm."
>
> While MediaPipe can be leveraged to provide wrist orientation, we still need the position to fully teleoperate any arm and end-effector, or even just a floating hand in a simulator. The wrist position remains a bottleneck, as I haven't seen any visual method achieve operator-friendly teleoperation. I believe this still falls under "hand" and should be included even if the repo focuses on hands.
>
> IK is not an issue, as there are mature solutions available. I noticed you mentioned using FrankMocap for wrist position in another discussion here. I'll look into FrankMocap, though its installation and integration might be challenging due to its older dependency versions. By the way, FrankMocap is no longer maintained. May I ask what SOTA alternatives you might suggest?
>
> Thanks again.
>
> Update:
>
> I checked BunnyVisionPro and found the wrist pose is from ARKit HandTrackingProvider and is only available for iOS and VisionOS. So, using other common commercial depth cameras will not be feasible. I was thinking about using your method directly to process human demonstration video data and teleoperate a robot, but it seems there is still a significant gap to overcome.

Hi @idombanker

Yes, I agree with you. If you are not looking for the IK code but only the hand position detection code from a single camera, I recommend checking the code in my previous project. It can locate the hand wrist position in 3D space, but note that it also uses FrankMocap internally for this feature.

Link here: https://github.com/yzqin/dex-hand-teleop/blob/3f7b56deed878052ec733a32b503aceee4ca8c8c/hand_detector/hand_monitor.py#L102

yzqin commented 2 months ago

> > @idombanker This repo has the code for wrist orientation detection here. But the inverse kinematics code is not provided in this repo. By the way, you can also use the kinematics code inside Bunny-VisionPro for motion control even without a Vision Pro.
>
> Thanks for your reply @yzqin. I've tried reusing the IK algorithm in BunnyVisionPro, but I failed to locate the code. It seems that the wrist pose isn't extracted and computed separately. If you could point me to the code, I'd be most grateful.

@UIOSN

If you are looking for the inverse kinematics code, you can find it here.

If you are looking for the hand-wrist pose detection code, please check the conversation above in this issue discussion.

UIOSN commented 2 months ago

> > > @idombanker This repo has the code for wrist orientation detection here. But the inverse kinematics code is not provided in this repo. By the way, you can also use the kinematics code inside Bunny-VisionPro for motion control even without a Vision Pro.
> >
> > Thanks for your reply @yzqin. I've tried reusing the IK algorithm in BunnyVisionPro, but I failed to locate the code. It seems that the wrist pose isn't extracted and computed separately. If you could point me to the code, I'd be most grateful.
>
> @UIOSN
>
> If you are looking for the inverse kinematics code, you can find it here.
>
> If you are looking for the hand-wrist pose detection code, please check the conversation above in this issue discussion.

@yzqin OK, thanks for your reply.

git-xuefu commented 1 week ago

> Hi, this is awesome work! I have a LEAP Hand in my lab. I want to use my laptop's camera to capture my hand's finger pose, then use dex-retargeting to retarget that pose to my LEAP Hand. However, my OS is Win11. Can I use it? How can I do it? Maybe my question is very basic, but I really need your help. Thank you so much :) (I am already able to do joint control of the dexterous hand, but I don't know how to deploy your code.) :)

Hello, may I ask where I can buy the LEAP Hand you mentioned?