666DZY666 / micronet

micronet, a model compression and deployment library.

Compression:
1. Quantization: quantization-aware training (QAT) for high-bit (>2b) (DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) / ternary and binary (TWN / BNN / XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT).
2. Pruning: normal, regular, and group-convolution channel pruning.
3. Group convolution structure.
4. Batch-normalization fusion for quantization.

Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic shape.
MIT License

Quantizing the BN layer on its own, without bn_fold: how should this be implemented? #51

Open ghost opened 3 years ago

ghost commented 3 years ago

Hi, I'd like to ask: if I don't fuse the BN layer into the conv layer, but instead quantize the BN layer on its own, how should that be implemented? Is the idea to treat the BN layer's gamma parameter as the equivalent of the conv layer's w, and to also quantize the feature map fed into the BN layer? Is that the right way to think about it?
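For reference, here is a minimal PyTorch sketch of what the question describes: treat the BN affine transform as a per-channel "weight" (gamma / sqrt(var + eps)) and fake-quantize both that weight and the incoming feature map. The symmetric uniform fake-quantizer and the class name `QuantBN2d` below are illustrative assumptions, not micronet's own API.

```python
import torch
import torch.nn as nn


def fake_quant(x, num_bits=8):
    # Symmetric uniform fake quantization (quantize -> dequantize).
    # Straight-through gradient estimation is omitted for brevity.
    qmax = 2.0 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    return torch.round(x / scale).clamp(-qmax, qmax) * scale


class QuantBN2d(nn.BatchNorm2d):
    """BatchNorm2d quantized on its own, without conv fusion.

    gamma / sqrt(var + eps) plays the role of a conv weight w,
    so it is quantized the same way a weight would be; the input
    feature map is quantized as an activation.
    """

    def __init__(self, num_features, num_bits=8):
        super().__init__(num_features)
        self.num_bits = num_bits

    def forward(self, x):
        # Quantize the feature map entering the BN layer.
        x_q = fake_quant(x, self.num_bits)
        # Use running statistics (deployment-style view of BN;
        # training-time QAT would use batch statistics instead).
        std = torch.sqrt(self.running_var + self.eps)
        w = self.weight / std                   # effective per-channel "weight"
        b = self.bias - self.running_mean * w   # effective bias
        w_q = fake_quant(w, self.num_bits)      # quantize like a conv weight
        return x_q * w_q.view(1, -1, 1, 1) + b.view(1, -1, 1, 1)


bn = QuantBN2d(16).eval()
y = bn(torch.randn(2, 16, 8, 8))  # quantized-BN forward pass
```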

666DZY666 commented 3 years ago

You can try that. But the usual practice is BN-fold quantization (fuse BN into the conv, then quantize).
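For contrast, a minimal sketch of the fused path the reply recommends, using the standard folding formulas W_fold = gamma * W / sqrt(var + eps) and b_fold = beta + gamma * (b - mean) / sqrt(var + eps). The function name `fold_bn_into_conv` is illustrative, not micronet's API; after folding, the fused conv weight is quantized as a single weight tensor.

```python
import torch
import torch.nn as nn


def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a BatchNorm2d into the preceding Conv2d (inference-time fusion).

    W_fold = gamma * W / sqrt(var + eps)
    b_fold = beta + gamma * (b - mean) / sqrt(var + eps)
    """
    std = torch.sqrt(bn.running_var + bn.eps)
    scale = bn.weight / std  # per-output-channel scaling factor
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding,
        dilation=conv.dilation, groups=conv.groups, bias=True,
    )
    fused.weight.data = conv.weight.data * scale.view(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = bn.bias.data + (bias - bn.running_mean) * scale
    return fused  # quantize fused.weight afterwards, as one conv weight
```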