[FR] Add default confidence thresholds for some (or all?) detection models

brimoor commented 3 years ago

Due to implementation details, some detection models in the zoo, such as rfcn-resnet101-coco-tf and ssd-inception-v2-coco-tf, always emit a very large number of detections for every image (300 and 100, respectively). This is undesirable because it makes the raw output unreadable in the App and makes the Sample objects very large, which slows things down.

The Dataset.apply_model() method accepts an optional confidence_thresh argument that controls the confidence threshold for any model. However, many models also support a parameter like confidence_thresh directly in their associated Config files, so we could hard-code a low confidence threshold like 0.1 in the model zoo manifests. The default syntax dataset.apply_model(model) would then result in reasonable behavior:

"default_deployment_config_dict": {
    "type": "fiftyone.core.eta_utils.ETAModel",
    "config": {
        "type": "eta.detectors.TFModelsDetector",
        "config": {
            "labels_path": "{{eta}}/tensorflow/models/research/object_detection/data/mscoco_complete_label_map.pbtxt",
            "confidence_thresh": 0.1
        }
    }
}

The only concern here is that some use cases like PR curves may want every detection to be available. The user could recover this behavior as follows:

import fiftyone.zoo as foz

model = foz.load_zoo_model("rfcn-resnet101-coco-tf")
model.config.confidence_thresh = 0

although this is not really documented. I tried to make model.config.confidence_thresh available on almost all models, but it's not formally defined in the fiftyone.core.models.Model interface.

Example raw output from offending models:

And result after filtering with confidence >= 0.1:

ehofesmann commented 3 years ago

I really don't want people to have to think about the model configs when they just want to run a model. Why not just give the confidence_thresh kwarg in apply_model() a default value of 0.1?
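
A minimal sketch of that proposal, in plain Python rather than FiftyOne's actual implementation. The Detection dataclass, the apply_model() function, and the fake_model callable below are hypothetical stand-ins, and the scores are made up for illustration. The keyword defaults to 0.1, and passing None keeps every detection for use cases such as PR curves.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Detection:
    # Hypothetical stand-in for a detection label
    label: str
    confidence: float


def apply_model(
    images: List[str],
    model: Callable[[str], List[Detection]],
    confidence_thresh: Optional[float] = 0.1,
) -> List[List[Detection]]:
    """Runs `model` on each image and drops low-confidence detections.

    Pass `confidence_thresh=None` to keep every detection.
    """
    results = []
    for image in images:
        detections = model(image)
        if confidence_thresh is not None:
            # Post-inference filter: the model has already built every object
            detections = [
                d for d in detections if d.confidence >= confidence_thresh
            ]
        results.append(detections)
    return results


def fake_model(image: str) -> List[Detection]:
    # Made-up scores, for illustration only
    return [
        Detection("cat", 0.92),
        Detection("dog", 0.07),
        Detection("car", 0.10),
    ]
```

With this shape, apply_model(images, fake_model) filters at 0.1 by default, while apply_model(images, fake_model, confidence_thresh=None) returns the raw output.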

brimoor commented 3 years ago

Yeah I suppose that might be the easiest solution. My two reservations on that approach were that:

- Confidence may have different meanings for certain models. But I guess it is easy enough for the user to override the default in such cases.
- apply_model(..., confidence_thresh=) is applied post-inference, which means that the offending models that generate tons of predictions would do a bit of extra work to construct Label objects that would then be thrown away. Our data model is a bit slow at times right now, so that may introduce some undesirable performance overhead. But I think I have a clever solution to avoid that: internally set model.config.confidence_thresh, if possible, before using the model.

Thanks for the nudge to think about making apply_model(..., confidence_thresh=) work.
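
One way the "set the config first, otherwise filter afterwards" idea could look, as a rough sketch only. ThresholdAwareModel, PlainModel, predict_with_thresh(), and the num_built counter are all hypothetical names invented for illustration, not FiftyOne classes, and plain float scores stand in for Label objects.

```python
from types import SimpleNamespace


class ThresholdAwareModel:
    """Hypothetical model whose config exposes `confidence_thresh`."""

    def __init__(self, scores):
        self.config = SimpleNamespace(confidence_thresh=None)
        self._scores = scores
        self.num_built = 0  # counts the outputs the model constructs

    def predict(self, image):
        thresh = self.config.confidence_thresh
        kept = [s for s in self._scores if thresh is None or s >= thresh]
        self.num_built += len(kept)
        return kept


class PlainModel:
    """Hypothetical model with no threshold option in its config."""

    def __init__(self, scores):
        self._scores = scores
        self.num_built = 0

    def predict(self, image):
        self.num_built += len(self._scores)
        return list(self._scores)


def predict_with_thresh(model, image, confidence_thresh=None):
    config = getattr(model, "config", None)
    # Push the threshold into the model only if its config supports it
    pushed_down = confidence_thresh is not None and hasattr(
        config, "confidence_thresh"
    )
    if pushed_down:
        original = config.confidence_thresh
        config.confidence_thresh = confidence_thresh
    try:
        outputs = model.predict(image)
    finally:
        if pushed_down:
            # Restore whatever the caller had configured
            config.confidence_thresh = original
    if confidence_thresh is not None and not pushed_down:
        # Fallback: filter post-inference
        outputs = [s for s in outputs if s >= confidence_thresh]
    return outputs
```

Both paths return the same filtered outputs. The difference is that the threshold-aware model never constructs the discarded predictions, while the plain model builds all of them before the filter runs.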

voxel51 / fiftyone

[FR] Add default confidence thresholds for some (or all?) detection models #751