NVIDIA-AI-IOT / trt_pose

Real-time pose estimation accelerated with NVIDIA TensorRT
MIT License

How to get the keypoint out? #79

Open AK51 opened 3 years ago

AK51 commented 3 years ago

Hi, I am new to pose estimation. I want to use the pose to control a physical robot, so it mimics the movement of a real person. I can print out cmap, paf, counts, objects and peaks. Where can I learn more about these variables, and which one should I use for the servo motors? An example of extracting the shoulder rotation would be great. Thanks

guillebot commented 3 years ago

I want the same! How do you get the points?

varhidibence commented 3 years ago

Do you mean how to get the coordinates of the keypoints on the image?

guillebot commented 3 years ago

Sorry to hijack this question @AK51, but I think we are looking for the same thing: how to get the coordinates? I would like, for example, to programmatically know whether the arms are over the head. Thanks

varhidibence commented 3 years ago

> Sorry to hijack this question @AK51, but I think we are looking for the same thing: how to get the coordinates? I would like, for example, to programmatically know whether the arms are over the head. Thanks

I actually wrote a method that extracts the keypoints into a Python dict:

def get_keypoints(image, human_pose, topology, object_counts, objects, normalized_peaks):
    """Get the keypoints from the parsed network output and put them into a dictionary
    where the keys are keypoint names and the values are the (x, y) coordinates. The
    coordinates are interpreted on the image given.

    Args:
        image: cv2 image the coordinates are scaled to
        human_pose: parsed human_pose.json describing the keypoint names
        topology: topology tensor built from human_pose.json
        object_counts, objects, normalized_peaks: tensors returned by ParseObjects

    Returns:
        dict: keys are keypoint names, values are the (x, y) pixel coordinates
    """
    height = image.shape[0]
    width = image.shape[1]
    keypoints = {}
    K = topology.shape[0]  # number of links in the topology (not used below)
    count = int(object_counts[0])

    for i in range(count):  # one entry per detected person
        obj = objects[0][i]
        C = obj.shape[0]  # number of keypoint types
        for j in range(C):
            k = int(obj[j])
            if k >= 0:  # keypoint j was detected for this person
                # peaks are stored as (y, x), normalized to [0, 1]
                peak = normalized_peaks[0][j][k]
                x = round(float(peak[1]) * width)
                y = round(float(peak[0]) * height)
                keypoints[human_pose["keypoints"][j]] = (x, y)

    return keypoints
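
For completeness, this is roughly how I wire it up with the ParseObjects pipeline from the live_demo notebook, and what you could do with the resulting dict for the two use cases above (shoulder rotation for a servo, arms over the head). The keypoint names ('left_shoulder', 'nose', ...) have to match the "keypoints" list in human_pose.json, and the angle/check logic is only a sketch, not something I have tested on a robot:

import json
import math

import trt_pose.coco
from trt_pose.parse_objects import ParseObjects

# Keypoint definition shipped with trt_pose (tasks/human_pose/human_pose.json)
with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)

topology = trt_pose.coco.coco_category_to_topology(human_pose)
parse_objects = ParseObjects(topology)

# cmap and paf are the two tensors the TensorRT model outputs for one frame,
# and image is the cv2 frame that was fed to the model.
counts, objects, peaks = parse_objects(cmap, paf)
keypoints = get_keypoints(image, human_pose, topology, counts, objects, peaks)

# Shoulder rotation: angle of the upper arm relative to the horizontal, in degrees.
if 'left_shoulder' in keypoints and 'left_elbow' in keypoints:
    sx, sy = keypoints['left_shoulder']
    ex, ey = keypoints['left_elbow']
    shoulder_angle = math.degrees(math.atan2(ey - sy, ex - sx))
    print('left shoulder angle:', shoulder_angle)

# Arms over the head: both wrists above the nose.
# Image y grows downwards, so "above" means a smaller y value.
if all(k in keypoints for k in ('nose', 'left_wrist', 'right_wrist')):
    arms_up = (keypoints['left_wrist'][1] < keypoints['nose'][1]
               and keypoints['right_wrist'][1] < keypoints['nose'][1])
    print('arms over head:', arms_up)

The angle is measured in image coordinates, so you would still have to map it to your servo's range and probably smooth it over a few frames.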