micronet is a model compression and deployment library.

Compression:
1. Quantization: quantization-aware training (QAT) — high-bit (>2b) (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) ternary/binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT).
2. Pruning: normal, regular, and group-convolution channel pruning.
3. Group convolution structure.
4. Batch-normalization fusion for quantization.

Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic shape.
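The QAT idea above can be illustrated with a minimal sketch of fake quantization — quantize to integers, then immediately dequantize, so the forward pass sees quantization error while the tensors stay in floating point. This is a generic uniform affine scheme for illustration, not the library's actual implementation; the function name and the per-tensor min/max range choice are assumptions.

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    # Simulate uniform affine quantization during training: map x to
    # num_bits-wide integers, then dequantize back to float, so the
    # forward pass carries quantization error while staying in fp32.
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    scale = max(scale, 1e-8)  # guard against a constant tensor
    zero_point = np.clip(np.round(qmin - x.min() / scale), qmin, qmax)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale  # dequantized float tensor
```

In a real QAT setup the rounding step is paired with a straight-through estimator so gradients can flow through it; the sketch only shows the forward computation.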
Thanks for sharing such an excellent project. While studying the quantization-aware training code in IAO, I noticed that when the BN layers are not fused, BN uses the original floating-point parameters and is not fake-quantized the way the convolution layers are. Since the BN layer should also introduce quantization error, why is it processed directly in floating point here? I also tried quantization-aware training with BN fusion, changing only the bn_fuse and train_batch_size (set to 128) parameters in the code, but after a few epochs the loss became NaN and training could not continue. Could you help explain this? Thanks.
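For reference, the BN fusion mentioned in the question is conventionally done by folding the BatchNorm affine transform into the preceding convolution's weights and bias. A minimal sketch of the standard folding formula follows; the function name and NumPy shapes are my own assumptions, not the repo's actual code:

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    # Fold y = gamma * (conv(x) + b - mean) / sqrt(var + eps) + beta
    # into a single convolution with rescaled weights and shifted bias.
    std = np.sqrt(var + eps)
    W_fused = W * (gamma / std).reshape(-1, 1, 1, 1)  # per-output-channel scale
    b_fused = beta + (b - mean) * gamma / std
    return W_fused, b_fused
```

Note that during bn_fuse training the folded weights depend on the batch statistics (mean, var), which are noisy early in training; a very small effective scale `gamma / sqrt(var + eps)` can blow up the fused weights, which is one common source of NaN losses in this kind of setup.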