rockchip-linux / rknn-toolkit

BSD 3-Clause "New" or "Revised" License

Hybrid quantization with PyTorch FX #264

Open vincentfung13 opened 2 years ago

vincentfung13 commented 2 years ago

Hi,

I am trying to convert my PyTorch model to RKNN, using Torch FX (https://pytorch.org/tutorials/prototype/fx_graph_mode_quant_guide.html) as the quantization framework. Some parts of my model are not suitable for quantization / would cause a significant performance drop, so I skipped those layers when running PTQ.
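
For reference, a minimal sketch of this kind of skip (the toy model and module names are made up, and it assumes torch >= 1.13 with the `QConfigMapping` API):

```python
import torch
import torch.nn as nn
import torch.ao.nn.quantized as nnq
from torch.ao.quantization import get_default_qconfig, QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3)
        self.conv2 = nn.Conv2d(8, 8, 3)   # layer we want to keep in float

    def forward(self, x):
        return self.conv2(self.conv1(x))

model = TinyNet().eval()
example = torch.randn(1, 3, 32, 32)

# setting the qconfig for "conv2" to None excludes it from quantization
qconfig_mapping = (
    QConfigMapping()
    .set_global(get_default_qconfig("fbgemm"))
    .set_module_name("conv2", None)
)

prepared = prepare_fx(model, qconfig_mapping, example_inputs=(example,))
prepared(example)          # one calibration pass
quantized = convert_fx(prepared)

# conv1 is now a quantized module; conv2 is still a float nn.Conv2d,
# so it carries no scale/zero-point for a converter to read.
assert isinstance(quantized.conv1, nnq.Conv2d)
assert isinstance(quantized.conv2, nn.Conv2d)
```

The resulting mixed graph (quantized conv1, float conv2) is exactly the shape of model that triggers the error below.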

Here are the issues:

  1. If I don't skip any layers in PTQ, the model converts to RKNN without any problem, and the performance matches the TorchScript model;
  2. If I skip any layer in PTQ, calling rknn.load_pytorch fails with an error (see attached screenshot).

I am assuming this is because RKNN tries to find quantization params in every layer, and they do not exist for the skipped layers (e.g. a skipped layer remains Conv2d instead of QuantizedConv2d).

Is there a workaround for this problem? Thanks!
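
For anyone hitting the same error, here is a small hypothetical helper (not part of rknn-toolkit) to list which submodules are still floating-point after convert_fx, i.e. the layers the converter will not find quantization params for:

```python
import torch.nn as nn

# module types that indicate a layer was left in float (extend as needed);
# quantized Conv2d/Linear do not subclass their float counterparts, so
# isinstance() here only matches layers that were actually skipped.
FLOAT_TYPES = (nn.Conv2d, nn.Linear)

def float_leftovers(model: nn.Module):
    """Return names of submodules still in float after convert_fx."""
    return [name for name, mod in model.named_modules()
            if isinstance(mod, FLOAT_TYPES)]

# example on a plain float model: every conv/linear is reported
m = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Linear(8, 4))
print(float_leftovers(m))  # ['0', '2']
```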

zen-xingle commented 1 year ago

Sorry, hybrid quantization does not support PTQ/QAT models in this version.

vincentfung13 commented 1 year ago

Thanks for your reply. Do you plan on supporting this in the next version? We found the built-in quantization tool to be unstable, so it would be really helpful if hybrid quantization for PyTorch models were supported.