ntakouris opened this issue 2 years ago
@ntakouris Thanks for the recommendation. Yes, that's true; we should provide better support for it.
The PyTorch core team has recently changed the API for quantization (FX-based, see here). The feature is still new, so hopefully in the near future, once it becomes more stable, we can dedicate time to investigate and provide a solution.
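For reference, a minimal sketch of the FX graph mode flow on a plain torchvision classifier (hedged: the exact entry points have moved between releases, and newer versions expose them under `torch.ao.quantization` with `prepare_fx` additionally taking an `example_inputs` argument):

```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx
from torchvision.models import resnet18

model = resnet18(pretrained=True).eval()

# Attach a default post-training quantization config to the whole model.
qconfig_dict = {"": get_default_qconfig("fbgemm")}
prepared = prepare_fx(model, qconfig_dict)

# Calibrate with representative data (a random tensor here, for brevity).
with torch.no_grad():
    prepared(torch.randn(1, 3, 224, 224))

# Convert the calibrated model to int8.
quantized = convert_fx(prepared)
```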
The other thing that would be good to have would be ensuring that these new quantized models can actually be exported to ONNX/TensorRT efficiently (with int8 weights).
I am not sure if this is out of scope / too much to ask, as this functionality would depend on any possible breaking changes on onnx/NVIDIA's side.
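The check itself would presumably just be attempting the export on the converted model (a hypothetical sketch reusing `quantized` from above; the file name, dummy input, and opset are placeholders, and whether the quantized ops export cleanly is exactly the open question):

```python
import torch

# Try to export the int8 model produced by convert_fx above; this may
# fail depending on which quantized ops the chosen opset supports.
torch.onnx.export(
    quantized,
    torch.randn(1, 3, 224, 224),  # dummy input
    "quantized_resnet18.onnx",
    opset_version=13,
)
```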
🚀 The feature
Add quantization support for `BackboneWithFPN`.

Motivation, pitch
Currently, it is possible to use `resnet_fpn_backbone` / `BackboneWithFPN` from `torchvision.models.detection.backbone_utils` in order to produce an FPN backbone network, given some sort of network that can produce features.

While the original ResNet and other base networks you may want to use (e.g. EfficientNet) already have quantized variants (with pretrained weights too), it would be nice to support QAT in the context of FPN networks, to speed up the detectors even more; see the sketch below.
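A minimal sketch of the requested flow, assuming the FX QAT entry points could simply be pointed at the FPN backbone (the `prepare_qat_fx` call is the unsupported/unverified part; the model name, qconfig, and dummy input are illustrative):

```python
import torch
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torch.quantization import get_default_qat_qconfig
from torch.quantization.quantize_fx import prepare_qat_fx, convert_fx

# This part works today: a float ResNet-50 + FPN backbone.
backbone = resnet_fpn_backbone("resnet50", pretrained=True)

# Hypothetical QAT flow this issue asks for; BackboneWithFPN is not
# guaranteed to be FX-traceable or to have a supported quantized variant.
qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}
backbone.train()
prepared = prepare_qat_fx(backbone, qconfig_dict)

# ...fine-tune `prepared` with the usual detection training loop, then:
quantized_backbone = convert_fx(prepared.eval())
features = quantized_backbone(torch.randn(1, 3, 224, 224))
```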
Alternatives
No response
Additional context
No response