cocodataset / cocoapi

COCO API - Dataset @ http://cocodataset.org/

OKS derive Keypoint Constant k values #539

Open babciaapcia opened 3 years ago

babciaapcia commented 3 years ago

Hello,

I'm working with a custom dataset on a keypoint detection task and I'd like to apply the OKS metric to my results. Section 1.3 of https://cocodataset.org/#keypoints-eval describes how the per-keypoint constants were derived; however, I get different numbers when I try to derive them myself, also using the validation set.

The formula is `σ_i^2 = E[d_i^2 / s^2]`, where s is the square root of the object area. I'm not entirely sure where d_i comes from: in Section 1.2, d_i is clearly defined as the distance between ground truth and predictions, but here we don't have any predictions, only a set of pre-annotated keypoints.

I tried deriving d_i in the following way instead. For each keypoint type i (17 different keypoints) and every annotation j in which that keypoint is labeled (v > 0), I computed the mean position (ux_i, uy_i) of keypoint i and subtracted it from each keypoint position (x_ij, y_ij), giving (dx_ij, dy_ij). I then computed the magnitude d_ij of (dx_ij, dy_ij) and scaled its square by the object's squared scale (its area) s_j, so: `σ_ij^2 = d_ij^2 / s_j`

Finally, to get σ_i, I took the mean of all σ_ij.
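For reference, the steps above can be sketched roughly as follows. This is only one reading of the description: the function name, the `(N, K, 3)` array layout, and the choice to average σ_ij (rather than σ_ij^2) are assumptions, and nothing here is part of pycocotools.

```python
import numpy as np

def sigmas_from_dataset_spread(keypoints, areas):
    """Hypothetical helper, not a pycocotools function.

    keypoints: (N, K, 3) array of (x, y, v) for N annotations and K keypoint types.
    areas:     (N,) array of object areas, i.e. s_j in the post (scale squared).
    Returns a (K,) array of sigma_i; NaN where a keypoint is never labeled.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    areas = np.asarray(areas, dtype=float)
    num_kpts = keypoints.shape[1]
    sigmas = np.full(num_kpts, np.nan)
    for i in range(num_kpts):
        labeled = keypoints[:, i, 2] > 0          # keep annotations with v > 0
        if not labeled.any():
            continue
        xy = keypoints[labeled, i, :2]
        mean_xy = xy.mean(axis=0)                 # (ux_i, uy_i)
        d2 = ((xy - mean_xy) ** 2).sum(axis=1)    # d_ij^2
        sigma2_ij = d2 / areas[labeled]           # d_ij^2 / s_j
        sigmas[i] = np.sqrt(sigma2_ij).mean()     # mean of sigma_ij
    return sigmas
```

Note that this measures how much each keypoint's image position varies across the whole dataset, which is not necessarily the same quantity as the d_i of Section 1.2.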

The results are completely different from the COCO values: `sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89])/10.0`

Can anyone explain the appropriate way to derive those values? Cheers

AmeurSoualmi commented 2 years ago

Hello, thank you for this question. Did you get any answers so far? I'm also trying to work out how to define d_i in order to set k_i.

torigara603 commented 2 years ago

Hi. I am also interested in this discussion.

I think the idea is to use duplicate data, i.e. the same objects annotated by multiple annotators. The ground truth is the average coordinate of the overlapping keypoints, and the predictions are the individual overlapping keypoints. By calculating the variance σ^2 over all the duplicate data, one can calculate the standard deviation σ for each keypoint.

Therefore, k_i = 2σ_i.
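A minimal sketch of that idea, assuming every object has been annotated by the same number of annotators and that every keypoint is labeled in every annotation (no visibility handling). The function name and the `(M, R, K, 2)` array layout are hypothetical, chosen only for illustration.

```python
import numpy as np

def k_from_redundant_annotations(annotations, areas):
    """Hypothetical helper, not a pycocotools function.

    annotations: (M, R, K, 2) array; M objects, R annotators, K keypoints, (x, y).
    areas:       (M,) array of object areas, i.e. s^2 for each object.
    Returns (k, sigma), each of shape (K,), with k_i = 2 * sigma_i.
    """
    ann = np.asarray(annotations, dtype=float)
    areas = np.asarray(areas, dtype=float)
    # "Ground truth": mean position over annotators for each object and keypoint.
    consensus = ann.mean(axis=1, keepdims=True)        # (M, 1, K, 2)
    # "Predictions": each individual annotation; squared distance to the consensus.
    d2 = ((ann - consensus) ** 2).sum(axis=-1)         # (M, R, K)
    # Normalize by the squared object scale, then average: sigma_i^2 = E[d_i^2 / s^2].
    sigma2 = (d2 / areas[:, None, None]).mean(axis=(0, 1))
    sigma = np.sqrt(sigma2)
    return 2.0 * sigma, sigma
```

For example, a single object of area 4 whose one keypoint is placed at (0, 0) by one annotator and at (2, 0) by another gives d^2 = 1 for both annotations, so σ = 0.5 and k = 1.0.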