Model size comparison - Githubissues

666DZY666 / micronet

micronet, a model compression and deploy lib. compression: 1. quantization: quantization-aware-training (QAT), High-Bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), Low-Bit (≤2b) / Ternary and Binary (TWN/BNN/XNOR-Net); post-training-quantization (PTQ), 8-bit (tensorrt); 2. pruning: normal, regular and group convolutional channel pruning; 3. group convolution structure; 4. batch-normalization fuse for quantization. deploy: tensorrt, fp32/fp16/int8 (ptq-calibration), op-adapt (upsample), dynamic_shape

MIT License

2.22k stars, 479 forks

Model size comparison #99

Open dan123yi opened 2 years ago

dan123yi commented 2 years ago

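How did the author produce the model size comparison chart? Every time I change the quantization bit width, the saved model comes out the same size.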

YangNuoCheng commented 2 years ago

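I have the same question. For example, with 4-bit quantization the weights take only 16 distinct values, but those values are still stored as floating-point numbers. Does that really reduce the model size?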
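
The effect described here can be reproduced with a small standard-library sketch. It assumes a simple uniform quantizer over [-1, 1] (a hypothetical `fake_quantize` helper, not micronet's actual quantizer) and shows that snapping weights to 16 levels does not change the number of bytes needed when each value is still stored as a 32-bit float.

```python
from array import array
import random

def fake_quantize(values, bits):
    """Snap values in [-1, 1] to 2**bits evenly spaced levels, returned as floats."""
    levels = 2 ** bits - 1
    out = []
    for v in values:
        v = max(-1.0, min(1.0, v))               # clamp to the assumed range
        code = round((v + 1.0) / 2.0 * levels)   # integer code in [0, levels]
        out.append(code / levels * 2.0 - 1.0)    # map the code back to a float
    return out

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(1000)]

fp32 = array("f", weights)                         # original weights as 32-bit floats
fq4 = array("f", fake_quantize(weights, bits=4))   # 4-bit levels, still 32-bit floats

print("distinct values after 4-bit fake quantization:", len(set(fq4)))
print("bytes before:", len(fp32.tobytes()), "bytes after:", len(fq4.tobytes()))
```

The saved size only shrinks if the integer codes themselves are stored (for example, two 4-bit codes per byte) instead of the dequantized floats.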

dan123yi commented 2 years ago

Got it.

dan123yi commented 2 years ago

The author said that quantization is at the op level, meaning the weights are only quantized at inference time.

YangNuoCheng commented 2 years ago

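What does "op level" mean? Does it mean that when TensorRT is used for acceleration, the model is automatically quantized to the corresponding integers? I'm new to quantization, thanks for the reply!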

dan123yi commented 2 years ago

An op is just an arithmetic operation, such as addition, subtraction, multiplication, or division.

666DZY666 commented 2 years ago

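This repository does quantization-aware training, so the models are all stored in floating point. The model sizes in the published figures are theoretical values that I calculated myself. That calculation could be turned into code; feel free to write it and contribute it.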
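
One way such a theoretical calculation might look is sketched below. It is not the author's script: the function name `theoretical_size_bytes` and the layer shapes are hypothetical, and it assumes every weight is stored at the same bit width, ignoring scales, biases, batch-norm parameters, and any layers kept at full precision.

```python
from math import prod

def theoretical_size_bytes(layer_shapes, bits_per_weight):
    """Bytes needed if every weight were stored with `bits_per_weight` bits."""
    total_params = sum(prod(shape) for shape in layer_shapes)
    total_bits = total_params * bits_per_weight
    return (total_bits + 7) // 8  # round up to whole bytes

# Hypothetical weight shapes, for illustration only (not taken from micronet):
# two conv layers (out_ch, in_ch, kH, kW) and one linear layer (out, in).
shapes = [(64, 3, 3, 3), (128, 64, 3, 3), (10, 128)]

for bits in (32, 8, 4, 1):
    size = theoretical_size_bytes(shapes, bits)
    print(f"{bits:>2}-bit weights: {size} bytes")
```

With a real PyTorch model, the shapes could be collected from `model.state_dict()` instead of being written by hand.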

rourou8023 commented 2 years ago

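@dan123yi Hello, may I ask how the quantized model should be handled afterwards? My model is still in floating point after binarization training. How do I convert it to actual binary values?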

dan123yi commented 2 years ago

I'd like to know too :sob::sob::sob:
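
The thread does not give an answer, but a generic approach is to store only the sign of each binarized weight as a single bit. The sketch below is an assumption-laden illustration, not micronet's method: `pack_binary_weights` and `unpack_binary_weights` are hypothetical helpers, weights are assumed to be +1/-1 after training, and any per-channel scaling factor (as used in XNOR-Net-style schemes) would need to be saved separately.

```python
def pack_binary_weights(weights):
    """Pack +1/-1 weights into bytes, 8 per byte (bit 1 means non-negative)."""
    packed = bytearray((len(weights) + 7) // 8)
    for i, w in enumerate(weights):
        if w >= 0:
            packed[i // 8] |= 1 << (7 - i % 8)  # most significant bit first
    return bytes(packed)

def unpack_binary_weights(packed, count):
    """Recover `count` weights as +1.0 / -1.0 floats from the packed bytes."""
    return [
        1.0 if (packed[i // 8] >> (7 - i % 8)) & 1 else -1.0
        for i in range(count)
    ]

# Example binarized weights, as they might look after training (still floats).
binarized = [1.0, -1.0, -1.0, 1.0, 1.0, 1.0, -1.0, 1.0, -1.0, 1.0]

packed = pack_binary_weights(binarized)
restored = unpack_binary_weights(packed, len(binarized))

print("float32 storage:", len(binarized) * 4, "bytes")
print("packed storage: ", len(packed), "bytes")
print("round trip ok:  ", restored == binarized)
```

Packing like this reduces storage, but an inference runtime still needs kernels that consume the packed format to benefit from it at run time.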