ccslience opened this issue 2 years ago
Were you able to figure it out?
c is the coordinates of the center of the person you want to detect.
s is the scale of the region around c that is cropped and resized for preprocessing. As an example, COCO models use an input size of 384x288, and as s is increased, a wider region of the image is included in the crop.
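Assuming MMPose's convention that scale is expressed in units of pixel_std = 200 (an assumption about the library's default, not stated above), s = [4.2, 4.2] corresponds to a source region of about 840x840 pixels centered on c. A minimal sketch of that relationship:

```python
import numpy as np

# Assumed MMPose convention: scale is in units of pixel_std = 200,
# so the source region mapped to the model input is s * 200 pixels.
pixel_std = 200
c = np.array([665, 485])   # person's center point (from the demo below)
s = np.array([4.2, 4.2])   # scale (from the demo below)

region = s * pixel_std          # size of the cropped source region
top_left = c - region / 2       # corners of the region in the source image
bottom_right = c + region / 2
print(region, top_left, bottom_right)
```

Increasing s grows `region` symmetrically around c, which is why a larger s includes more of the image.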
Also, here is demo code for an image from the MPII dataset:
# Assumes `img` is the input image loaded with cv2.imread and `model` is an
# initialized MMPose top-down pose model; the import path below is for MMPose 0.x.
import cv2
import numpy as np
import torch
import torchvision.transforms.functional as F
from mmpose.core.post_processing import get_affine_transform

def add_joints_mpii(image, joints):
    # MPII limb connections as pairs of joint indices
    mpii_part_orders = [[0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [2, 6], [3, 6],
                        [6, 7], [7, 8], [8, 9], [7, 12], [10, 11], [11, 12],
                        [7, 13], [13, 14], [14, 15]]
    for pairs in mpii_part_orders:
        if joints[pairs[0]][2] > 0.2 and joints[pairs[1]][2] > 0.2:
            cv2.line(image,
                     (int(joints[pairs[0]][0]), int(joints[pairs[0]][1])),
                     (int(joints[pairs[1]][0]), int(joints[pairs[1]][1])),
                     (255, 255, 255), 3)
    for joint in joints:
        if joint[2] > 0.2:
            cv2.circle(image, (int(joint[0]), int(joint[1])), 3, (0, 255, 255), 7)

width = 256
height = 256
c = np.array([665, 485])  # person's center point
s = np.array([4.2, 4.2])  # [width / 384, width / 288]
r = 0
image_size = [height, width]
trans = get_affine_transform(c, s, r, image_size)
# cv2.warpAffine expects dsize as (width, height)
img_trans = cv2.warpAffine(img, trans, (int(image_size[1]), int(image_size[0])),
                           flags=cv2.INTER_LINEAR)
cv2.imwrite('output_trans.jpg', img_trans)
img_trans = F.to_tensor(img_trans)
# Normalize the tensor with the ImageNet mean and std
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
img_trans = F.normalize(img_trans, mean=mean, std=std)
print(img_trans.shape)
# Add a batch dimension: (3, H, W) -> (1, 3, H, W)
img_trans = torch.reshape(img_trans, (1, 3, height, width))
print(img_trans.shape)
data = {}
data['img'] = img_trans
data['img_metas'] = [{'image_file': '', 'center': c, 'scale': s, 'rotation': r,
                      'bbox_score': 1,
                      'flip_pairs': [[0, 5], [1, 4], [2, 3], [10, 15], [11, 14], [12, 13]],
                      'bbox_id': 0}]
result = model(return_loss=False, **data)
print(result['preds'][0].shape)
add_joints_mpii(img, result['preds'][0])
Thanks for your nice work. However, I have some doubts about setting the c and s values. I have an image in which the person's box is [676, 884, 2401, 3904] (xmin, ymin, xmax, ymax), the image size is (3268, 3989) (width, height), and the model input size is (384, 288) (width, height). How should I calculate the c and s values, especially s?
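For reference, MMPose's top-down pipeline converts a bounding box to (center, scale) roughly as follows. This is a sketch assuming the library's defaults of pixel_std = 200 and a 1.25 padding factor; `xyxy_to_cs` is a hypothetical helper name, not an MMPose API:

```python
import numpy as np

def xyxy_to_cs(bbox, input_size, pixel_std=200.0, padding=1.25):
    """Convert an (xmin, ymin, xmax, ymax) box to a (center, scale) pair.

    Sketch of the convention used by MMPose's top-down pipeline: the box
    is expanded to match the input aspect ratio, divided by pixel_std,
    and padded. pixel_std=200 and padding=1.25 are assumed defaults.
    """
    xmin, ymin, xmax, ymax = bbox
    w, h = xmax - xmin, ymax - ymin
    center = np.array([xmin + w / 2.0, ymin + h / 2.0])

    aspect_ratio = input_size[0] / input_size[1]  # input width / input height
    if w > aspect_ratio * h:
        h = w / aspect_ratio   # box too wide: grow its height
    else:
        w = h * aspect_ratio   # box too tall: grow its width
    scale = np.array([w / pixel_std, h / pixel_std]) * padding
    return center, scale

# The box and input size from the question above
c, s = xyxy_to_cs([676, 884, 2401, 3904], input_size=(384, 288))
print(c, s)
```

With these numbers the box (1725 x 3020 pixels) is taller than the input aspect ratio, so its width is expanded before dividing by pixel_std, keeping the crop's aspect ratio equal to the model input's.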