cchamber opened this issue 6 years ago
@cchamber If visibility == 0, the keypoint is not labeled (not in the image). If visibility == 1, the keypoint is in the image but not visible, e.g. hidden behind an object. If visibility == 2, the keypoint is clearly visible, not hidden.
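To make the three flags concrete, here is a minimal sketch of the convention described above (the helper name and label strings are my own, not part of the COCO API):

```python
# Mapping of COCO visibility flags to their meaning (labels are illustrative).
VISIBILITY_LABELS = {
    0: "not labeled (keypoint not in the image)",
    1: "labeled but not visible (e.g. occluded by an object)",
    2: "labeled and clearly visible",
}

def describe_visibility(v):
    """Return a human-readable description of a visibility flag."""
    return VISIBILITY_LABELS.get(v, "invalid flag")
```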
Is visibility == 1 still used for training and evaluation?
I wonder whether occluded and self-occluded keypoints should be annotated.
Looking at some COCO person keypoint examples, it looks like:
Here is the description:
" ... the visibility flags of the ground truth (the detector's predicted visibility [is] not used)... These similarities are averaged over all labeled keypoints (keypoints for which visibility > 0). Predicted keypoints that are not labeled (visibility=0) do not affect the [Evaluation]"
Ground truth visibility is used for training and evaluation.
During training, only keypoints labeled in the ground truth (v > 0) are included in the loss.
https://github.com/facebookresearch/Detectron/blob/8170b25b425967f8f1c7d715bea3c5b8d9536cd8/detectron/utils/keypoints.py#L181
https://github.com/facebookresearch/Detectron/blob/8170b25b425967f8f1c7d715bea3c5b8d9536cd8/detectron/roi_data/keypoint_rcnn.py#L75-L91
https://github.com/facebookresearch/Detectron/blob/8170b25b425967f8f1c7d715bea3c5b8d9536cd8/detectron/modeling/keypoint_rcnn_heads.py#L122-L127
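The masking in those links can be sketched roughly as follows. This is not Detectron's actual code, just a minimal NumPy illustration of a per-keypoint cross-entropy over flattened heatmap logits in which v == 0 keypoints contribute nothing to the loss:

```python
import numpy as np

def masked_keypoint_loss(logits, targets, visibility):
    """Illustrative masked keypoint loss.

    logits: (K, H*W) heatmap scores per keypoint
    targets: (K,) target heatmap bin index per keypoint
    visibility: (K,) ground-truth v flags
    """
    valid = visibility > 0  # only labeled keypoints enter the loss
    if not valid.any():
        return 0.0
    # Numerically stable softmax cross-entropy per keypoint.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_kp = -log_probs[np.arange(len(targets)), targets]
    return per_kp[valid].mean()  # v == 0 keypoints are masked out
```

Because of the mask, arbitrarily bad predictions at v == 0 keypoints leave the loss unchanged.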
Only keypoints labeled in the ground truth (v > 0) are included in OKS calculations. https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py#L230-L231
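A minimal sketch of that OKS computation, mirroring the pycocotools logic (the sigma values you would pass in are the per-keypoint constants; the ones used in the test below are illustrative, not the official values):

```python
import numpy as np

def oks(gt_xy, gt_v, dt_xy, area, sigmas):
    """Object Keypoint Similarity, averaged over labeled keypoints only.

    gt_xy, dt_xy: (K, 2) ground-truth and predicted coordinates
    gt_v: (K,) ground-truth visibility flags
    area: ground-truth object area
    sigmas: (K,) per-keypoint falloff constants
    """
    vars_ = (2 * sigmas) ** 2
    d2 = ((gt_xy - dt_xy) ** 2).sum(axis=1)        # squared distances
    e = d2 / vars_ / (area + np.spacing(1)) / 2
    labeled = gt_v > 0                             # v == 0 keypoints ignored
    return np.exp(-e[labeled]).mean() if labeled.any() else 0.0
```

A prediction that is perfect at every labeled keypoint scores OKS = 1 no matter where the v == 0 keypoints are predicted.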
In other words, predictions for keypoints labeled v = 0 are not penalized or rewarded in either the loss or the metrics.
It would make sense to label self-occluded points with 1 (or even add a new class), but I don't think COCO enforced this.
Are self-occluded keypoints going to have 1 as the visibility flag?
Same question! What is the recommended visibility flag for a blurred or self-occluded joint?
Hey all! I'm also wrestling with uncertainty on how/when to use the visibility flags. Some use cases I'm unsure about (I'm focused here on noses):
Category | Visibility | Use Case |
---|---|---|
1. "Soft" Self-Occlusion | 2? | A person's own hair is occluding their nose. |
2. "Medium" Self-Occlusion | 0? | A person's hand is occluding their own nose. |
3. "Hard" Self-Occlusion | 0? | A person's head is turned away from the camera so the back of their head is occluding their nose. |
4. Other Person Occlusion | 1? | A person's nose is occluded by another person's hand. |
5. Wearable Occlusion | 1? | A person's nose is occluded by a wearable like a mask. |
6. External Object Occlusion | 1? | A person's nose is occluded by an object like a tree branch, or by a car's sun visor when viewing the person through the windshield. |
7. Blur | 2? | A face is present in the distance and I can guess the location of a keypoint like nose, but the image is too blurry for me to clearly denote the nose. |
8. Low Exposure | 2? | A face is present and I can guess the location of a keypoint like the nose, but the image is underexposed (dark) so I can barely make out the nose. |
Any insights from those who have struggled themselves with applying visibility labels? @cchamber - had you come up with a consistent definition that you went by?
Some thoughts: Why not simply annotate everything that a human annotator can guess, including occluded or blurred keypoints?
@cchamber I have seen your great work on the infants dataset, and I wonder how you solved the visibility flag problem? Thanks!
I have a general question about the labelling in the COCO dataset. For the keypoints, the ordering of labels is [x, y, visibility, x, y, visibility, ...]. What does a label of visibility = 1 mean? This clearly covers occlusion, but does it also refer to parts that are not visible because of blur? In my case I am dealing with frames of videos, and joints are often blurred because of movement. Thank you!
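For reference, the flat list can be split into (x, y, v) triplets like this (the annotation values below are made up for illustration):

```python
# COCO stores keypoints as a flat list [x1, y1, v1, x2, y2, v2, ...].
kp = [142, 309, 1, 177, 320, 2, 191, 398, 0]  # made-up annotation values

# Group into (x, y, v) triplets, one per keypoint.
triplets = [tuple(kp[i:i + 3]) for i in range(0, len(kp), 3)]

# Keep only labeled keypoints (v > 0); v == 0 entries carry no annotation.
labeled = [(x, y) for (x, y, v) in triplets if v > 0]
```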