Closed stbdang closed 2 years ago
Thanks for the FR!
One option we have to represent non-visible points is to insert (NaN, NaN) for those points. The app allows this, and won't render the points. In this case, we would not need a separate way to store visibility flags.
However, visible=True/False is a more generic concept that we could introduce for other label types like Detections. So, I think I'm inclined to adopt your idea of optional parallel lists of data. Here's what I'm thinking for naming:
points: a list of (x, y) keypoints in [0, 1] x [0, 1]. If visible[i] == False then points[i] can be arbitrary
confidences: an optional list of confidences corresponding to points. Defaults to None
visible: an optional list of True/False indicating whether each point is visible. If not provided, all points are assumed to be visible

Cool, I looked into confidence->confidences and it seems "keypoint-rcnn-resnet50-fpn-coco-torch" does return both "scores" (which I think are mapped to detection instances) and "keypoint_scores" (undocumented but shown in the output), so we can probably replace confidence with confidences and populate per-point confidences from "keypoint_scores" here.
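The naming proposed above (points plus optional parallel confidences and visible lists) can be sketched in plain Python. This is only an illustration of the proposed shape: KeypointSketch and is_visible are hypothetical names, and this is not the FiftyOne Keypoint class.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class KeypointSketch:
    """Hypothetical stand-in for the proposed schema (not fo.Keypoint)."""

    points: List[Tuple[float, float]]
    confidences: Optional[List[float]] = None  # defaults to None
    visible: Optional[List[bool]] = None  # None means "all visible"

    def __post_init__(self):
        # Parallel lists only make sense with one entry per point
        n = len(self.points)
        for name in ("confidences", "visible"):
            values = getattr(self, name)
            if values is not None and len(values) != n:
                raise ValueError(f"{name} must have one entry per point")

    def is_visible(self, i: int) -> bool:
        # If `visible` is not provided, every point is assumed visible
        return True if self.visible is None else self.visible[i]
```

Keeping the extra lists optional means existing labels that only store points remain valid without migration.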
Cool. So here's roughly what needs to be done to make this happen on the backend:
Replace the confidence field with a confidences field that is a nullable float-list. You can define that like so:
confidences = fof.ListField(field=fof.FloatField(), null=True)
That will give us the behavior we want: all Keypoint instances will now have a confidences field that is None by default but is also allowed to be a list of floats.
Update any relevant Keypoint logic in the codebase to use confidences where applicable. I think you're right that the only real builtin place that uses keypoint confidences is zoo models like keypoint-rcnn-resnet50-fpn-coco-torch, and the relevant code to change there is the KeypointDetectorOutputProcessor class
Update the App to render per-point confidences in the tooltip, if available. We'll need @benjaminpkane's help on that one
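As a rough illustration of the second step, the sketch below turns one detected instance's keypoints and per-point scores into normalized points and a parallel confidences list. It assumes the model emits pixel coordinates; the function name and argument layout are hypothetical and do not reflect the actual KeypointDetectorOutputProcessor code.

```python
from typing import List, Sequence, Tuple


def to_points_and_confidences(
    keypoints: Sequence[Sequence[float]],
    scores: Sequence[float],
    width: int,
    height: int,
) -> Tuple[List[Tuple[float, float]], List[float]]:
    """Convert one instance's pixel keypoints and per-point scores.

    keypoints: one (x, y, ...) entry per joint, in pixels (assumed layout)
    scores: one score per joint, in the same order
    """
    if len(keypoints) != len(scores):
        raise ValueError("expected one score per keypoint")

    # Normalize to [0, 1] x [0, 1], the range the points field expects
    points = [(kp[0] / width, kp[1] / height) for kp in keypoints]
    confidences = [float(s) for s in scores]
    return points, confidences
```

The two returned lists stay index-aligned, so confidences[i] always describes points[i].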
My suggestion would be that we save the visible field for separate work, since that's a feature that can apply to all label types.
In the meantime, all Label types can already have arbitrary fields added to them:

import fiftyone as fo

kp1 = fo.Keypoint(
    label="cat",
    points=[[0, 0], [1, 1]],
    visible=[True, False],
)

So I'm basically just saying let's wait on declaring visible as a default field and expecting the App to recognize and support it
I want to raise the visibility flag topic. This is IMO the only missing piece to non-destructively work with COCO Keypoint datasets. Currently, keypoints with visibility flag 0 are loaded as (nan, nan) coordinates and other values are loaded unchanged. All keypoints are exported with visibility 2: https://github.com/voxel51/fiftyone/blob/9421782ef07cac0d68a47224f3eaba9c3a0d3f1d/fiftyone/utils/coco.py#L2225
Even after fixing bug https://github.com/voxel51/fiftyone/issues/3309, the export changes all visibility 1 keypoints to visibility 2.
The definition of visibility flag:
v=0: not labeled / not in the image; v=1: labeled but not visible; v=2: labeled and visible.
https://cocodataset.org/#keypoints-eval https://github.com/cocodataset/cocoapi/issues/130
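To make the export concern concrete, here is a minimal sketch of writing COCO-style [x, y, v] triplets without collapsing every flag to 2. It assumes points are normalized with (nan, nan) marking unlabeled joints, and that a parallel visible list is available; the function name is hypothetical and this is not the code in fiftyone/utils/coco.py.

```python
import math
from typing import List, Optional, Sequence, Tuple


def to_coco_keypoints(
    points: Sequence[Tuple[float, float]],
    visible: Optional[Sequence[bool]],
    width: int,
    height: int,
) -> List[float]:
    """Flatten points into COCO [x1, y1, v1, x2, y2, v2, ...] form."""
    flat: List[float] = []
    for i, (x, y) in enumerate(points):
        if math.isnan(x) or math.isnan(y):
            # v=0: not labeled; COCO stores x = y = 0 in this case
            flat.extend([0, 0, 0])
            continue

        # v=2: labeled and visible, v=1: labeled but not visible.
        # With no `visible` list, every labeled point is treated as visible.
        v = 2 if visible is None or visible[i] else 1
        flat.extend([x * width, y * height, v])
    return flat
```

With a per-point visible list, a v=1 keypoint survives a load and export round trip instead of being rewritten as v=2.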
@brimoor Are there any plans on adding the visible parameter? Currently, it is more difficult to work with the COCO keypoint format, and in general it makes sense to add the visible option.
Proposal Summary
Add support for per-point confidence/visibility. Currently, the Keypoint label class (which represents a set of points associated with an instance) is a flat list of (x, y), which doesn't provide additional per-point information. In addition, the COCO import skips over points with visibility = 0 (not visible/annotated), which can break the implicit joint -> point mapping, since num_joints != num_points and there's no information on which joint was skipped.
Motivation
What is the use case for this feature? This is needed to correctly import COCO keypoint datasets, as well as to filter samples based on per-point confidence, etc.
Why is this use case valuable to support for FiftyOne users in general? Anyone using keypoint detection would probably need this, since most models output per-point confidence, and there needs to be a way to distinguish in the ground truth whether a point is visible or not, plus a mapping from point to joint.
Why is this use case valuable to support for your project(s) or organization? My project needs a keypoint dataset. I was hoping to leverage the existing COCO dataset; however, the import breaks the mapping, so any partially annotated keypoints (e.g. missing things) are useless.
Why is it currently difficult to achieve this use case? (please be as specific as possible about why related FiftyOne features and components are insufficient) One workaround would be to only use fully annotated samples (no v=0), which would probably throw out the majority of keypoint annotations.
What areas of FiftyOne does this feature affect?
fiftyone Python library
Details
(Use this section to include any additional information about the feature. If you have a proposal for how to implement this feature, please include it here.)
We probably need to fix COCOObject::to_keypoints to not skip over the v == 0 case and encode this visibility information in Keypoint by either:
Option 1. Expand KeypointsField to include confidence/visibility (not sure how it would support "optional-ness")
Option 2. Add a separate field "confidences"/"visibilities" in the Keypoint class which maps to "points"
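A minimal sketch of Option 2 on the import side: parse the flat COCO keypoints list without dropping v == 0 entries, so the joint index is preserved. The function name is hypothetical, and using (nan, nan) as the placeholder for unlabeled joints is an assumption for illustration, not the current COCOObject behavior.

```python
import math
from typing import List, Sequence, Tuple


def parse_coco_keypoints(
    flat: Sequence[float], width: int, height: int
) -> Tuple[List[Tuple[float, float]], List[bool]]:
    """Parse [x1, y1, v1, ...] into index-aligned points and visibilities."""
    if len(flat) % 3 != 0:
        raise ValueError("expected a flat list of (x, y, v) triplets")

    points: List[Tuple[float, float]] = []
    visibilities: List[bool] = []
    for i in range(0, len(flat), 3):
        x, y, v = flat[i : i + 3]
        if v == 0:
            # Keep a placeholder so joint i still maps to points[i]
            points.append((math.nan, math.nan))
            visibilities.append(False)
        else:
            points.append((x / width, y / height))
            visibilities.append(v == 2)  # v=1 is labeled but not visible
    return points, visibilities
```

Because every joint yields exactly one entry, num_joints == num_points holds even for partially annotated instances.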
Willingness to contribute
The FiftyOne Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?
I would love to contribute; however, I'm not a Python expert and I'm new to FiftyOne, so I would probably need some guidance.