leoxiaobin / deep-high-resolution-net.pytorch

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"
https://jingdongwang2017.github.io/Projects/HRNet/PoseEstimation.html
MIT License

About the Input Image Size #122

Open · eng100200 opened this issue 5 years ago

eng100200 commented 5 years ago

@leoxiaobin Sorry to bother you, but I have a question about the size of the input image. The paper says: "We extend the human detection box in height or width to a fixed aspect ratio: height : width = 4 : 3, and then crop the box from the image, which is resized to a fixed size, 256 × 192". What does this mean? Do you mean that if an image in COCO is 300x400, we extend the box's width or height to obtain the 4:3 ratio, then crop that region and resize it to 256x192? If so, the corresponding keypoint labels also need to be modified. Please reply.

leoxiaobin commented 5 years ago

We extend the person bounding box to a fixed aspect ratio of 4:3, then crop the person bounding box from the image. Finally, we resize the cropped person image patch to 256x192 or 384x288 for training and testing.
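For readers of this thread, here is a minimal sketch of that procedure. It is not the repository's exact code (the dataset classes work with a center/scale representation and an affine transform); the function name and the zero-padding behaviour outside the image are illustrative assumptions:

```python
import cv2
import numpy as np

def crop_person(image, box, out_w=192, out_h=256):
    """Extend `box` (x, y, w, h) to a 4:3 (height:width) aspect ratio,
    crop it from `image`, and resize the patch to (out_h, out_w).
    Illustrative sketch only."""
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0          # keep the box center fixed
    aspect = out_h / float(out_w)              # 256 / 192 = 4 / 3
    if h / float(w) > aspect:                  # box too tall -> widen it
        w = h / aspect
    else:                                      # box too wide -> heighten it
        h = w * aspect
    # Map the extended box onto the output patch with an affine transform,
    # so regions falling outside the image are zero-padded rather than clipped.
    src = np.float32([[cx - w / 2, cy - h / 2],
                      [cx + w / 2, cy - h / 2],
                      [cx - w / 2, cy + h / 2]])
    dst = np.float32([[0, 0], [out_w - 1, 0], [0, out_h - 1]])
    trans = cv2.getAffineTransform(src, dst)
    patch = cv2.warpAffine(image, trans, (out_w, out_h))
    return patch, trans
```

Extending the box to 4:3 before cropping (rather than stretching an arbitrary crop) keeps the person's proportions intact in the resized patch.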

eng100200 commented 5 years ago

@leoxiaobin Thank you for the reply. So you crop all the persons from the original image and resize each cropped person into a new, separate image of size 256x192 for training, is that right? Then the keypoints also need to be re-labeled, since the original keypoint coordinates will no longer match, correct?

eng100200 commented 5 years ago

@leoxiaobin So each of these cropped persons is resized to 256x192 and becomes a new, separate training image, i.e. each person is its own single image. Is that right?

leoxiaobin commented 5 years ago

yes
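To make the confirmation concrete: the ground-truth keypoints are not re-annotated by hand; the same 2x3 affine transform used to crop and resize the patch is applied to the keypoint coordinates, so the labels stay aligned with the 256x192 image. A rough sketch, where `transform_keypoints` is an illustrative name rather than a function from this repository:

```python
import numpy as np

def transform_keypoints(keypoints, trans):
    """Apply the 2x3 affine transform `trans` (the one used to crop the
    person patch) to a (K, 2) array of keypoint coordinates, returning
    coordinates in the resized patch. Illustrative sketch only."""
    ones = np.ones((keypoints.shape[0], 1), dtype=keypoints.dtype)
    pts = np.concatenate([keypoints, ones], axis=1)   # (K, 3) homogeneous
    return pts @ trans.T                              # (K, 2) in patch coords
```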

eng100200 commented 5 years ago

@leoxiaobin Morning, how are you? If I directly input the COCO images for training and use the same label data as in the COCO dataset, what will the effect on training and performance be? Do you have any idea? Have you tried it?

eng100200 commented 5 years ago

@leoxiaobin Hello, can I ask you a question?

eng100200 commented 5 years ago

@leoxiaobin I don't quite understand: your code does not compute the scale and center for the MPII dataset; instead they are taken from the JSON files. Also, there are no detection bounding boxes in the validation JSON file, so do we need to run a detector to get the boxes before testing on the validation set?
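For context on the center/scale question: the COCO loader in this repository converts each detection box into a (center, scale) pair, whereas the MPII annotations already ship with center and scale, so no conversion is needed there. Below is a hedged sketch of such a conversion; expressing the scale in units of 200 pixels and adding a small padding factor are assumptions modeled on how the COCO loader appears to work, not a copy of its code:

```python
import numpy as np

def box_to_center_scale(box, out_w=192, out_h=256, pixel_std=200.0, padding=1.25):
    """Convert a detection box (x, y, w, h) into a (center, scale) pair.
    `pixel_std` and `padding` are assumed values for illustration."""
    x, y, w, h = box
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)
    aspect = out_w / float(out_h)              # width:height = 3:4
    if w > aspect * h:                         # extend to the fixed aspect ratio
        h = w / aspect
    else:
        w = h * aspect
    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32) * padding
    return center, scale
```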

1334233852 commented 6 months ago

Hello, may I ask: my input images are 1280x720. Is the model adaptive to any input image size, or does the input have to be resized? How did you solve this problem?