Closed by a656418zz 1 year ago
Hi, thanks for using MMPose. You can simply use result["predictions"][0][0], which is the prediction with the highest bbox score, as the result for the most significant face. At the moment, the Inferencer only allows for inference with a batch size of 1.
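For reference, a minimal sketch of that indexing (the image path is the LaPa test image used later in this thread, and the field names match the result dict used in the code below):

from mmpose.apis import MMPoseInferencer

inferencer = MMPoseInferencer(pose2d='face', device='cpu')

# The inferencer returns a generator that yields one result dict per input image
result = next(inferencer('../mmpose/tests/data/lapa/13609937564_5.jpg', show=False))

# predictions[0] holds the instances of the single image in the batch;
# index [0] is the instance with the highest bbox score, i.e. the most significant face
best = result['predictions'][0][0]
keypoints = best['keypoints']          # [x, y] coordinates per keypoint
scores = best['keypoint_scores']       # per-keypoint confidence scores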
Thank you very much! May I ask another question? With this method I get the prediction with the highest bbox score, containing keypoints of shape [98, 2] and keypoint_scores of shape [98]. I understand these are the keypoint coordinates in the image and their confidence scores. However, the input of AdaptiveWingLoss is a torch.Tensor of shape [N, K, H, W], which does not match this shape. How can I use the keypoints obtained above to calculate the loss?
The heatmaps in shape [N, K, H, W] are generated by the codec: https://github.com/open-mmlab/mmpose/blob/537bd8e543ab463fb55120d5caaa1ae22d6aaf06/mmpose/codecs/msra_heatmap.py#L107-L111
The model generates such a heatmap first, and then the heatmap is decoded into coordinates using the process described in https://github.com/open-mmlab/mmpose/blob/537bd8e543ab463fb55120d5caaa1ae22d6aaf06/mmpose/codecs/msra_heatmap.py#L117-L150
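As an illustration only (the 256x256 input and 64x64 heatmap sizes are arbitrary, and the shapes follow the MSRAHeatmap docstring), an encode/decode round trip looks roughly like this:

import numpy as np
from mmpose.codecs import MSRAHeatmap

K = 98  # number of face keypoints in this thread
# input_size is the model input size as (w, h); heatmap_size is (W, H)
codec = MSRAHeatmap(input_size=(256, 256), heatmap_size=(64, 64), sigma=2)

# keypoints: (N, K, 2) in input-image coordinates; keypoints_visible: (N, K)
keypoints = np.random.rand(1, K, 2) * 256
keypoints_visible = np.ones((1, K), dtype=np.float32)

# encode() turns coordinates into Gaussian heatmaps of shape (K, H, W)
encoded = codec.encode(keypoints, keypoints_visible)
print(encoded['heatmaps'].shape)   # (98, 64, 64)

# decode() goes the other way: heatmaps back to coordinates and scores
decoded_keypoints, decoded_scores = codec.decode(encoded['heatmaps'])
print(decoded_keypoints.shape, decoded_scores.shape)   # (1, 98, 2), (1, 98)

The [N, K, H, W] tensor expected by AdaptiveWingLoss is simply a batch of these per-instance (K, H, W) heatmaps.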
Thank you very much for your guidance! I have read up on the related parts and now have a better understanding of the process. Would you consider this a correct application in the end? Here is my code:
from mmpose.apis import MMPoseInferencer
import torch
import mmcv
import numpy as np
from mmpose.codecs import MSRAHeatmap

def getheatmap(img_path):
    # Run the inferencer and keep the instance with the highest bbox score
    img = mmcv.imread(img_path)
    h, w, _ = img.shape
    result_generator = inferencer(img_path, show=False, draw_heatmap=True)
    result = next(result_generator)
    keypoints_visible = np.array(result["predictions"][0][0]["keypoint_scores"])
    keypoints_visible = keypoints_visible[np.newaxis, :]
    keypoints = np.array(result["predictions"][0][0]["keypoints"])
    keypoints = keypoints[np.newaxis, :]
    # MSRAHeatmap expects input_size and heatmap_size in (w, h) order
    heatmap_gen = MSRAHeatmap(input_size=(w, h), heatmap_size=(w, h), sigma=2)
    heatmap = heatmap_gen.encode(keypoints, keypoints_visible)
    return heatmap['heatmaps']

inferencer = MMPoseInferencer(pose2d='face', device="cpu")

img_path = '../mmpose/tests/data/lapa/13609937564_5.jpg'
result1 = torch.tensor(getheatmap(img_path)[np.newaxis, :])

img_path = '../mmpose/tests/data/lapa/13609937564_5.jpg'
result2 = torch.tensor(getheatmap(img_path)[np.newaxis, :])

from mmpose.models.losses import AdaptiveWingLoss
criterion = AdaptiveWingLoss()
loss = criterion(result1, result2)
print(loss)
In my opinion, the code snippet seems to be functioning properly. However, it is not common to use the codec in this manner. Generally, the target heatmap is produced by using ground truth keypoints, rather than predicted keypoints. Additionally, the model generates the predicted heatmap, not the codec.
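To make that distinction concrete, here is a minimal sketch of the conventional pattern, assuming a 98-keypoint face model; the random arrays and tensors stand in for real annotations and a real model output, and the 256/64 sizes are arbitrary:

import numpy as np
import torch
from mmpose.codecs import MSRAHeatmap
from mmpose.models.losses import AdaptiveWingLoss

K = 98
codec = MSRAHeatmap(input_size=(256, 256), heatmap_size=(64, 64), sigma=2)

# Target heatmaps are encoded from ground-truth keypoints by the codec
gt_keypoints = np.random.rand(1, K, 2) * 256            # stand-in for real annotations
gt_visible = np.ones((1, K), dtype=np.float32)
target = torch.from_numpy(codec.encode(gt_keypoints, gt_visible)['heatmaps'])
target = target.unsqueeze(0)                            # (N, K, H, W)

# Predicted heatmaps come from the model head, not from the codec;
# a random tensor stands in for the network output here
pred = torch.rand(1, K, 64, 64)

criterion = AdaptiveWingLoss()
loss = criterion(pred, target)
print(loss)

In this pattern the codec only ever sees the ground-truth side; the prediction side is whatever heatmap tensor the model head outputs.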
📚 The doc issue
Hello, thank you very much for your contribution! I have a project that needs two things from MMPose: obtaining the model's predicted keypoints and the ground-truth keypoints of human faces, and then using AdaptiveWingLoss() to compute a loss for my neural network model. As I am new to keypoint-related tasks, I have some conceptual questions.
Here is a question I encountered:
Thank you very much for your help.
Here is my test code:
from mmpose.apis import MMPoseInferencer
import torch
batch_size = 1
img_path = '../mmpose/tests/data/lapa/13609937564_5.jpg'
inferencer = MMPoseInferencer(pose2d='face', device="cpu")
result_generator = inferencer(img_path, show=False)
result = next(result_generator)
result1 = torch.tensor(result["predictions"][0][0]["keypoints"])
Suggest a potential alternative/fix
No response