666DZY666 / micronet

micronet, a model compression and deployment library.

Compression:
1. Quantization — quantization-aware training (QAT): high-bit (>2b) methods (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) ternary/binary methods (TWN, BNN, XNOR-Net); post-training quantization (PTQ): 8-bit (TensorRT).
2. Pruning — normal, regular, and group-convolution channel pruning.
3. Group convolution structure.
4. Batch-normalization fusion for quantization.

Deployment: TensorRT — fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic shape.
MIT License

Can post-training quantization be used to compress an already-trained model? #29

Closed — tianhaoyue closed this 1 year ago

tianhaoyue commented 4 years ago

Hello, the trained .pth object-detection model in my current project is about 460 MB, and I would like to compress it. For an already-trained model, such as a .pth file, can your method be applied to compress it? And if quantization has to be done during training, could I apply your method to compress a Mask R-CNN-based detector that predicts rotated (oriented) boxes? I would very much appreciate a reply, many thanks!

666DZY666 commented 3 years ago

1. For PTQ, refer to the deploy-tensorrt part (a sketch of that calibration flow follows below); 2. You could try quantizing just the backbone first and see how it performs. (For usage, see the usage reference in the docs, the 使用_迁移 ("usage – transfer") section.)
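The deploy-tensorrt part referenced in point 1 performs INT8 PTQ via entropy calibration. Below is a minimal sketch of that flow, assuming the TensorRT 8.x Python bindings plus pycuda; the names `model.onnx` and `calib_batches` are hypothetical, and this is not the repo's exact script:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed batches to TensorRT and caches the scale table."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = batches            # list of np.float32 arrays, NCHW
        self.index = 0
        self.cache_file = cache_file
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                   # tells TensorRT calibration is done
        cuda.memcpy_htod(self.device_input,
                         np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


def build_int8_engine(onnx_path, calibrator):
    builder = trt.Builder(LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.INT8)
    config.int8_calibrator = calibrator
    return builder.build_serialized_network(network, config)


# Usage (hypothetical names): export the trained .pth model to ONNX first,
# then calibrate on a few hundred real preprocessed images.
# calib = EntropyCalibrator(calib_batches)
# engine_bytes = build_int8_engine("model.onnx", calib)
```

Note that the .pth model must be exported to ONNX before this step, and the calibration batches should come from real, representatively preprocessed data.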

JSnobody commented 3 years ago

@tianhaoyue PTQ is currently well suited to 8-bit quantization; at 4 bits the results are noticeably worse.
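This point can be illustrated numerically with a generic symmetric uniform quantizer in NumPy (not code from this repo; the Gaussian weight tensor is a made-up stand-in): going from 8-bit to 4-bit multiplies the rounding step by 16 for the same weight range, and the signal-to-noise ratio of the reconstructed weights drops by roughly 6 dB per removed bit.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=100_000).astype(np.float32)  # toy weight tensor


def quantize_dequantize(x, bits):
    """Symmetric uniform PTQ: scale from max |x|, round, clip, de-quantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale


for bits in (8, 4, 2):
    err = w - quantize_dequantize(w, bits)
    snr_db = 10 * np.log10(np.mean(w ** 2) / np.mean(err ** 2))
    print(f"{bits}-bit PTQ: SNR ~= {snr_db:.1f} dB")
```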