Hi, the value of v is the confidence of the prediction, not a COCO visibility flag. It generally lies in [0, 1]. You can apply a threshold to it, for example treating a keypoint as visible when v > 0.2 and as invisible otherwise. The threshold should be chosen according to your own dataset.
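A minimal sketch of the thresholding suggested here, using three of the (x, y, v) rows from the output posted in this thread. The name `VIS_THR` and the value 0.3 are only illustrative assumptions, not a recommended setting.

```python
import numpy as np

# Three (x, y, v) rows copied from the inference output in this thread.
keypoints = np.array([
    [37.382774, 78.91617, 0.9567224],
    [2.837555, 113.46138, 0.60554993],
    [2.837555, 247.32408, 0.2709621],
], dtype=np.float32)

VIS_THR = 0.3  # hypothetical threshold; tune it on your own dataset

# Boolean mask: True where the confidence exceeds the threshold.
visible = keypoints[:, 2] > VIS_THR

# Number of keypoints treated as detected for this person.
num_visible = int(visible.sum())

print(visible.tolist(), num_visible)
```

Counting the entries of the mask gives a per-person keypoint count, which is the figure the COCO annotations store as `num_keypoints`.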
Hi open-mmlab developers, I am using a pretrained model (COCO format) to run inference on images, and I do not know how to interpret the keypoint values the model returns. According to the COCO docs, v can only be 0, 1, or 2, but the model outputs float values. Below is one of the outputs I got during image inference:
[{'bbox': array([ 0.6854594 , 45.018677 , 88.19361 , 265.10806 , 0.99963367], dtype=float32), 'keypoints': array([[ 37.382774 , 78.91617 , 0.9567224 ], [ 39.541847 , 74.59802 , 0.94851094], [ 30.90554 , 74.59802 , 0.9755936 ], [ 41.70092 , 65.961716 , 0.8276397 ], [ 20.11016 , 68.12079 , 0.9391268 ], [ 50.337227 , 76.757095 , 0.9199661 ], [ 4.9966354 , 85.3934 , 0.8950063 ], [ 74.08706 , 98.347855 , 0.88189924], [ 2.837555 , 113.46138 , 0.60554993], [ 78.405205 , 89.71156 , 0.9048376 ], [ 2.837555 , 111.30232 , 0.58759665], [ 30.90554 , 141.52937 , 0.8425735 ], [ 4.9966354 , 148.0066 , 0.8193674 ], [ 46.01908 , 189.02904 , 0.9078411 ], [ 2.837555 , 201.9835 , 0.90722847], [ 37.382774 , 243.00597 , 0.8663102 ], [ 2.837555 , 247.32408 , 0.2709621 ]], dtype=float32)
All of the v values are floats. The model also does not report how many keypoints it detected per person (the COCO ground-truth annotations provide this information). I used the following code block to generate the outputs.
Please guide me on how to interpret the value of v that I get during inference, and on how to get model outputs that are in line with the COCO ground-truth annotations for the person keypoint detection task.
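One possible way to bring a prediction closer to the COCO annotation layout is sketched below. It is not part of the library: `to_coco_style` and `vis_thr` are hypothetical names, and the sketch assumes the `bbox` array is `[x1, y1, x2, y2, score]`. A confidence score cannot separate COCO's v = 1 (labeled but not visible) from v = 2 (labeled and visible), so the sketch only emits 2 or 0.

```python
import numpy as np


def to_coco_style(pred, vis_thr=0.3):
    """Convert one prediction dict into a COCO-like annotation dict.

    Assumptions (not confirmed by the library): pred["bbox"] is
    [x1, y1, x2, y2, score] and pred["keypoints"] has shape (K, 3)
    with rows (x, y, confidence).
    """
    kpts = np.asarray(pred["keypoints"], dtype=float)
    visible = kpts[:, 2] > vis_thr

    # COCO stores keypoints as a flat list [x1, y1, v1, x2, y2, v2, ...].
    # Keypoints below the threshold are written as (0, 0, 0).
    flat = []
    for (x, y, _), vis in zip(kpts, visible):
        flat.extend([float(x), float(y), 2] if vis else [0.0, 0.0, 0])

    x1, y1, x2, y2, score = (float(t) for t in pred["bbox"])
    return {
        "keypoints": flat,
        "num_keypoints": int(visible.sum()),
        "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO uses [x, y, width, height]
        "score": score,
    }


# Small made-up prediction, used only to exercise the function.
example_pred = {
    "bbox": np.array([0.0, 10.0, 20.0, 50.0, 0.9]),
    "keypoints": np.array([[1.0, 2.0, 0.9], [3.0, 4.0, 0.1]]),
}
example_out = to_coco_style(example_pred)
print(example_out)
```

The threshold decides both the v flags and `num_keypoints`, so the two stay consistent with each other.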