Open LinasKo opened 8 months ago
The good news is - the error mostly appears when `Detections.merge` is called on results from unusual models - where `confidence` is missing, where `data` is set, or where `mask` exists (and there are no detections at least once).
Affected functions:

- `supervision/detection/core.py` -> `from_inference()`
- `supervision/detection/core.py` -> `from_roboflow()`
- `dataset/formats/yolo.py` -> `yolo_annotations_to_detections()`
- `dataset/formats/yolo.py` -> `load_yolo_annotations()`
- `dataset/formats/pascal_voc.py` -> `detections_from_xml_obj()`
- `dataset/formats/coco.py` -> `coco_annotations_to_detections()`
- `supervision/detection/core.py` -> `merge()`
- `tracker/byte_tracker/core.py` -> `update_with_detections()`
- `supervision/detection/tools/inference_slicer.py` -> `__call__()`
- `detection/tools/smoother.py` -> `DetectionsSmoother.get_smoothed_detections()`
See Colab for more details.
TL;DR: I think Option 3 is the least-bad solution. It adds complexity in exactly one location, keeps the design and architecture intact, stops the errors, and prevents future worries. I also advocate for two eventual changes regardless of the option taken (see the final section).
Here's how I see various mitigation options and their downsides. I can only speculate about the design goals and future plans of Roboflow, and I do not have access to the internal message board. So my goal now is to bring other devs up to speed and help think through the solutions.
- `Detections` is meant to be universal and easily understood, fitting both detection and segmentation models.
- `Detections.empty` represents empty detections of ANY type.
- The problem is not limited to the `data` field - I found an example where it fails for the same reason.

The error has two stages:
1. `Detections.empty`, called from specific model loaders, sets it up - this happens frequently.
2. `Detections.merge` triggers it - this happens rarely.

`merge` is called by `InferenceSlicer`.
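To make the two stages concrete, here is a minimal sketch of the failure mode (the array values are illustrative):

```python
import numpy as np
import supervision as sv

# Frame 1: an instance segmentation result - it has a mask but no confidence.
frame_1 = sv.Detections(
    xyxy=np.array([[10.0, 10.0, 50.0, 50.0]]),
    mask=np.ones((1, 100, 100), dtype=bool),
    class_id=np.array([0]),
)

# Frame 2: no detections. Detections.empty() sets xyxy, confidence and
# class_id, but leaves mask as None.
frame_2 = sv.Detections.empty()

# The two objects disagree on which fields are defined (mask vs. confidence),
# so merging them raises a ValueError.
merged = sv.Detections.merge([frame_1, frame_2])
```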
Option 1

Maybe it's not worth fixing? There haven't been many complaints. The devs just need to be extra careful when using `merge` in anything new, largely relating to instance segmentation. Who knows - maybe there aren't that many features left unimplemented.
Option 2

(The worst approach, in my opinion) Very radical - comment out `merge` and the inference slicer, disallow smoothing. Who cares if there are incompatible objects around - they're not interacting with each other anyway. (I care.)
Option 3

I like how `merge` works, given the assumptions. But maybe it's harmful, even if correct? If (`xyxy`, `confidence`, `class_id`) merges with (`xyxy`, `mask`, `class_id`), maybe find a way to let it happen? Form (`xyxy`, `mask`, `confidence`, `class_id`) somehow.
(Brainstorming - see the sketch below.) When we need new values for:

- `xyxy` - make a bounding box around `mask`, a 0-sized box otherwise.
- `confidence` - default to `1`.
- `class_id` - set to `-1` (raaaarely seen - e.g. `from_sam`).
- `tracker_id` - won't be an issue after #928 - in other cases `update_with_detections` ensures the ID is set.
- `mask` - expand `xyxy` into a rectangular mask.
- `data` - I don't know enough to say.

This seems to avoid both errors and architectural shifts, redefining what "compatible Detections" means, at least until the next feature comparing multiple detections is added. Is this the least-bad solution?
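A rough sketch of those defaults as a standalone helper - hypothetical code, not the implementation that eventually landed; the function name and the `resolution_wh` parameter are mine:

```python
import numpy as np
import supervision as sv

def pad_missing_fields(d: sv.Detections, resolution_wh: tuple) -> sv.Detections:
    """Fill in fields that this Detections lacks, using the defaults listed above."""
    n = len(d)
    w, h = resolution_wh
    if d.confidence is None:
        d.confidence = np.ones(n)               # confidence defaults to 1
    if d.class_id is None:
        d.class_id = np.full(n, -1)             # -1 = unknown class (e.g. from_sam)
    if d.mask is None:
        masks = np.zeros((n, h, w), dtype=bool)
        for i, (x1, y1, x2, y2) in enumerate(d.xyxy.astype(int)):
            masks[i, y1:y2, x1:x2] = True       # expand each box into a rectangular mask
        d.mask = masks
    return d
```

`xyxy`, `tracker_id` and `data` handling is omitted here; the list above covers them.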
Option 4

Currently, `empty` sets (`xyxy`, `confidence`, `class_id`). What is that? A detection model.

Let's make it initialize (`xyxy`, `mask`, `class_id`) for segmentation models. Some won't conform - SAM returns no `class_id`. So force it, make a mock ID! This way you'd have two distinct model types.

Then, either pass an argument to `Detections.empty` to tell it which one you're initializing, or define two different `empty` functions. For the outliers - full-field-set detections and those models with `data` - either set the fields explicitly, or add a third, more flexible `Detections.empty`. Same thing.
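A sketch of what two distinct constructors might look like - the function names below are hypothetical, not existing supervision API:

```python
import numpy as np
import supervision as sv

def empty_detection() -> sv.Detections:
    # Detection-model empty: xyxy, confidence, class_id (what empty() sets today).
    return sv.Detections(
        xyxy=np.empty((0, 4), dtype=np.float32),
        confidence=np.empty(0, dtype=np.float32),
        class_id=np.empty(0, dtype=int),
    )

def empty_segmentation(resolution_wh: tuple = (1, 1)) -> sv.Detections:
    # Segmentation-model empty: xyxy, mask, class_id - no confidence.
    # A mock class_id would be forced for models like SAM that return none.
    w, h = resolution_wh
    return sv.Detections(
        xyxy=np.empty((0, 4), dtype=np.float32),
        mask=np.empty((0, h, w), dtype=bool),
        class_id=np.empty(0, dtype=int),
    )
```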
Did you spot an issue?
Suppose you have an intermediary function. A call chain of: `Smoother` -> `some_func` -> `specific_empty`.

Which type of `empty` should it call? Somehow, with an argument or context of some sort, `some_func` needs to know which type of variables to initialize / which `empty` to call. At present, if there are no detections, such an intermediary would only receive an empty list - `[]`. No type information!

That, or you need to forbid intermediaries from accepting `[]` and return a specific `empty()` from the top-level caller instead.
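Sketched with a hypothetical `some_func`, the problem looks like this:

```python
from typing import List
import supervision as sv

def some_func(detections_list: List[sv.Detections]) -> sv.Detections:
    # Hypothetical intermediary between Smoother and the empty constructors.
    if not detections_list:
        # [] carries no information about the model type, so there is no way
        # to decide between a detection-style and a segmentation-style empty.
        return sv.Detections.empty()
    return sv.Detections.merge(detections_list)
```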
Here's a thought experiment: try inlining `Detections.empty`. I mean, replace it with the `Detections` constructor and then explicitly set the variables inside. In each case, can you tell which ones should be set? Try this inside `Detections.merge` to see the issue.

We're passing the type information with function calls; elsewhere that would be done via templates or generics. `Detections.empty<Det>()` and `Detections.empty<Seg>()` might be completely normal in other languages.
Type parameter syntax was only introduced in Python 3.12. Perhaps `TypeVar` could help, but generally this would make the code less accessible to new devs.
Option 5

Did you notice that we're approaching a multi-class solution? My claim is that `Detections` can be at least two distinct objects - the detection result and the segmentation result.

An option is to change the core design, split them in two, then define strict non-nullable members, and adjust everything else to match. `Detections` becomes a base class passed to users and annotators, and those have to frequently check its contents. Notice that with nullable members, similar checks might already be happening (I haven't checked).
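As a sketch of that split - the class names here are made up for illustration, not a proposed API:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class BaseDetections:
    # Shared, non-nullable members passed to users and annotators.
    xyxy: np.ndarray
    class_id: np.ndarray

@dataclass
class BoxDetections(BaseDetections):
    confidence: np.ndarray   # always present for detection models

@dataclass
class SegmentationDetections(BaseDetections):
    mask: np.ndarray         # always present for segmentation models
```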
Downside: `data` and non-standard parameter sets are an issue - for them, this solution is itself an issue.

Option 6

This is my least thought-out option. Null values serve a purpose, representing invalid members - e.g. "this `Detections` result will never have `confidence`". This requires extra validation and disallows safely having a strong `Detections.empty` function such as the one we have now.
Perhaps the `Optional` status can be revoked. There would need to be logic such that an empty `mask` and a never-present `mask` look the same. Is there harm in that?
I think so. Null checks can be removed in places like `__iter__()`, but in a `for` loop, is there a way for users to easily tell which fields might suddenly produce values?
I advocate for these two changes to be made eventually.
1. `Detections.empty` is meant to construct empty detections for ANY model. Therefore it HAS to be weaker-defined than any model. The code now sets (`xyxy`, `confidence`, `class_id`). Does every model have at least all of these? No. Trim it down until it can be used as the constructor for `Detections`. Otherwise you'll be setting fields that normal model results may never have, constructing invalid objects. After this is done - can the `Detections` constructor be used instead of `Detections.empty()`?
2. Separate `tracker_id` from the other members. It is addressed differently, has a special set of methods ensuring its existence, and is not produced by the models. A single empty line would give a hint to maintainers of its special status - a minuscule thing, but it would help them learn the code faster.

We've now implemented option 3 in https://github.com/roboflow/supervision/pull/1177.
Ideally, I'd like to see an attempt at trimming down `Detections.empty` to just `xyxy` before closing this.

Would that cause issues? Would that be an easy fix? Would it entail too much work to be worth it? I'd like to know.
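For reference, a sketch of what that trimmed-down `empty()` could return - a suggestion only, not the current implementation:

```python
import numpy as np
import supervision as sv

def minimal_empty() -> sv.Detections:
    # Only xyxy is set; every other field stays None, so empty() never
    # claims a field that a given model's results may not have.
    return sv.Detections(xyxy=np.empty((0, 4), dtype=np.float32))
```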
Search before asking
This is the underlying reason for #928. I believe it is important enough to deserve a separate issue.
Bug
`Detections.empty()` creates a very specific type of `Detections`, which is guaranteed to be incompatible with some models when merging results.

Example
Suppose we run an instance segmentation model on a video. It finds one detection in frame 1, but no detections in frame 2. When we try calling `Detections.merge` on these detections, it raises an error.

Error type 1
ValueError Traceback (most recent call last)

Error type 2

ValueError Traceback (most recent call last)

Deeper explanation
`Detections` contains these variables: `xyxy`, `mask`, `confidence`, `class_id`, `tracker_id`, `data`.

Suppose we call `Detections.merge([detections_1, detections_2])`. An error is rightfully raised when the same variable is defined differently in the two detections - for example, when `mask` is `None` in `detections_1` and `np.array([])` in `detections_2`. This makes sense, as we don't want to merge incompatible detections.

However, `Detections.empty()`, frequently used to initialize the no-detections case, sets only a specific subset of the fields - `xyxy`, `confidence` and `class_id`. Many models use other fields as well. Because of this, the set of defined fields in an empty detection might differ from what the model returns. When trying to merge these, an error is raised.

Environment
No response
Minimal Reproducible Example
See aforementioned example.
This Colab contains examples and a list of the affected functions.
Are you willing to submit a PR?
I could help submit a PR when I figure out a solution, but this is a non-trivial bug and, by my best estimate, solving it 'correctly' would require a redesign of `Detections`. Quick remediation may mean band-aid code and hoping that future features are written defensively when `merge` is involved.