666DZY666 / micronet

micronet, a model compression and deploy lib. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); 2. pruning: normal, regular, and group convolutional channel pruning; 3. group convolution structure; 4. batch-normalization fuse for quantization. Deploy: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.

Questions about the binarization implementation in WbWtAb #40

Closed: LumingSun closed this issue 3 years ago

LumingSun commented 3 years ago

Great work on this code! I have a few questions about the binarization implementation:

1. According to the paper, the STE is applied in backpropagation for the weights. Why does the WbWtAb implementation put the STE in Binary_a rather than in Binary_w?
2. The deterministic binarization uses torch.sign() directly, under which 0 stays 0 — doesn't that effectively make it ternarization?
3. In the forward function of activation_bin in util_wt_bab.py, why is a ReLU applied when A != 2?
4. According to other issues, the post-binarization/ternarization model sizes and compression ratios in the README were computed by hand. Were the bit-quantization results also computed that way?

Thanks!

666DZY666 commented 3 years ago

1. Both are binarized (see the minimal STE sketch after this list).
   - weight: https://github.com/666DZY666/Model-Compression-Deploy/blob/3959f194033a520d40fca4c2758874681981ea3c/compression/quantization/WbWtAb/util_wbwtab.py#L44
   - activation: https://github.com/666DZY666/Model-Compression-Deploy/blob/3959f194033a520d40fca4c2758874681981ea3c/compression/quantization/WbWtAb/util_wbwtab.py#L20
2. Yes, sign() maps 0 to 0, but in practice the values are never exactly 0, so the output stays binary.
3. That is simply the activation used when A is not binarized.
4. Yes, it can be computed (see the size-calculation sketch below). Support will be added to the code later.
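
For readers following along, here is a minimal, self-contained sketch of the STE pattern the maintainer describes, applied identically to weights and activations. The class name `BinarySTE` is made up for illustration; the repository's actual implementations are the `Binary_w`/`Binary_a` functions at the linked lines of util_wbwtab.py.

```python
import torch

class BinarySTE(torch.autograd.Function):
    # Forward: deterministic binarization with torch.sign().
    # torch.sign(0) == 0, but float weights/activations are
    # essentially never exactly 0, so the output stays binary.
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    # Backward: straight-through estimator -- pass the incoming
    # gradient through unchanged, zeroed outside |x| <= 1.
    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[x.abs() > 1] = 0
        return grad_input

# The same autograd trick serves both paths: binarize the weight
# before the convolution and the activation around it.
w = torch.randn(8, 3, 3, 3, requires_grad=True)
wb = BinarySTE.apply(w)           # binary weights
a = torch.randn(1, 8, 16, 16, requires_grad=True)
ab = BinarySTE.apply(a)           # binary activations
(wb.sum() + ab.sum()).backward()  # gradients flow via the STE
```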
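
On point 4, the size after k-bit quantization can be estimated the same way as the binary/ternary numbers in the README: parameter count times bits per weight. A rough sketch follows; the helper name `quantized_size_mb` is hypothetical, and this ignores per-layer scales, zero-points, and other metadata overhead.

```python
def quantized_size_mb(num_params: int, bits: int) -> float:
    # num_params weights stored at `bits` bits each, reported in MB
    return num_params * bits / 8 / 1024 ** 2

n = 1_000_000                      # example: a 1M-parameter model
fp32 = quantized_size_mb(n, 32)    # ~3.81 MB at fp32
int8 = quantized_size_mb(n, 8)     # ~0.95 MB at 8-bit
print(f"compression ratio: {fp32 / int8:.0f}x")  # 4x
```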