ntakouris opened this issue 2 years ago
@ntakouris Thanks for the recommendation. Yes, that's true; we should provide better support for it.
The PyTorch core team has recently changed the API for quantization (FX-based, see here). The feature is still new, so hopefully in the near future, once it becomes more stable, we can dedicate time to investigate and provide a solution.
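For reference, a minimal sketch of the FX graph mode flow on a plain torchvision classifier (hedged: the exact entry points have moved between releases, and newer versions expose them under `torch.ao.quantization` with `prepare_fx` additionally taking an `example_inputs` argument):

```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx
from torchvision.models import resnet18

model = resnet18(pretrained=True).eval()

# Attach a default post-training quantization config to the whole model.
qconfig_dict = {"": get_default_qconfig("fbgemm")}
prepared = prepare_fx(model, qconfig_dict)

# Calibrate with representative data (a random tensor here, for brevity).
with torch.no_grad():
    prepared(torch.randn(1, 3, 224, 224))

# Convert the calibrated model to int8.
quantized = convert_fx(prepared)
```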
The other thing that would be good to have would be ensuring that these new quantized models can actually be exported to ONNX/TensorRT efficiently (with int8 weights).
I am not sure if this is out of scope / too much to ask, as this functionality would depend on any possible breaking changes on onnx/NVIDIA's side.
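The check itself would presumably just be attempting the export on the converted model (a hypothetical sketch reusing `quantized` from above; the file name, dummy input, and opset are placeholders, and whether the quantized ops export cleanly is exactly the open question):

```python
import torch

# Try to export the int8 model produced by convert_fx above; this may
# fail depending on which quantized ops the chosen opset supports.
torch.onnx.export(
    quantized,
    torch.randn(1, 3, 224, 224),  # dummy input
    "quantized_resnet18.onnx",
    opset_version=13,
)
```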
🚀 The feature
Add quantization support for `BackboneWithFPN`.

Motivation, pitch
Currently, it is possible to use `resnet_fpn_backbone` / `BackboneWithFPN` from `torchvision.models.detection.backbone_utils` in order to produce an FPN backbone network, given some sort of network that can produce features.

While the original ResNet and other base networks you may want to use (e.g. EfficientNet) already have quantized variants (with pretrained weights too), it would be nice to support QAT in the context of FPN networks, to speed up the detectors even more; see the sketch below.
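A minimal sketch of the requested flow, assuming the FX QAT entry points could simply be pointed at the FPN backbone (the `prepare_qat_fx` call is the unsupported/unverified part; the model name, qconfig, and dummy input are illustrative):

```python
import torch
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torch.quantization import get_default_qat_qconfig
from torch.quantization.quantize_fx import prepare_qat_fx, convert_fx

# This part works today: a float ResNet-50 + FPN backbone.
backbone = resnet_fpn_backbone("resnet50", pretrained=True)

# Hypothetical QAT flow this issue asks for; BackboneWithFPN is not
# guaranteed to be FX-traceable or to have a supported quantized variant.
qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}
backbone.train()
prepared = prepare_qat_fx(backbone, qconfig_dict)

# ...fine-tune `prepared` with the usual detection training loop, then:
quantized_backbone = convert_fx(prepared.eval())
features = quantized_backbone(torch.randn(1, 3, 224, 224))
```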
Alternatives
No response
Additional context
No response