PeterL1n / RobustVideoMatting

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
https://peterl1n.github.io/RobustVideoMatting/
GNU General Public License v3.0

8-bit quantization #116

Closed · FengMu1995 closed this 2 years ago

FengMu1995 commented 2 years ago

I'd like to ask how to export an 8-bit quantized TorchScript model from the PyTorch model.

lblbk commented 2 years ago

Do you want to convert the quantized model to ONNX and then to DLC?

FengMu1995 commented 2 years ago

No, I quantize the PyTorch model directly and then export it to TorchScript. But neither the rvm_mobilenetv3_fp16.torchscript model from this repo nor my own int8-quantized model runs; float32 runs fine. I don't know whether the problem is in the conversion or in my inference code.

lblbk commented 2 years ago

> No, I quantize the PyTorch model directly and then export it to TorchScript. But neither the rvm_mobilenetv3_fp16.torchscript model from this repo nor my own int8-quantized model runs; float32 runs fine. I don't know whether the problem is in the conversion or in my inference code.

A quantized model should be convertible to TorchScript; I've converted a model similar to this one. Exporting it to ONNX is not supported, though.
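For reference, a minimal sketch of what this flow can look like in eager-mode PyTorch. Note that `torch.quantization.quantize_dynamic` only targets a few layer types (chiefly `nn.Linear` and LSTMs), so a convolution-heavy network like RVM generally needs static quantization or QAT instead; this may be why a naively quantized model fails at inference. The checkpoint and output filenames are assumptions, and scripting is assumed to work since the repo itself ships `.torchscript` models.

```python
import torch
from model import MattingNetwork  # model definition from this repo

# Load the released fp32 checkpoint.
model = MattingNetwork("mobilenetv3").eval()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth", map_location="cpu"))

# Dynamic quantization: weights of the listed layer types become int8 and
# activations are quantized on the fly. Only Linear/LSTM-style layers are
# supported, so RVM's convolutions stay fp32 under this method.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Scripting the quantized model usually works; exporting it to ONNX does not.
scripted = torch.jit.script(quantized)
scripted.save("rvm_mobilenetv3_int8_dynamic.torchscript")
```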

FengMu1995 commented 2 years ago

Did you use dynamic or static PyTorch quantization? I still can't get mine to run; it feels like the input was left unquantized while the model was quantized, which caused the problem. I didn't use ONNX here; I only used ONNX before to convert other models.

lblbk commented 2 years ago

> Did you use dynamic or static PyTorch quantization? I still can't get mine to run; it feels like the input was left unquantized while the model was quantized, which caused the problem. I didn't use ONNX here; I only used ONNX before to convert other models.

QAT (quantization-aware training). If quantization succeeds, each quantized layer will have two values, scale and zero_point, which you can see in Netron. You can also verify the result by running inference.
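To make the scale/zero_point point concrete, here is a minimal eager-mode QAT sketch on a toy module (not the poster's actual recipe; `TinyNet` is a stand-in for illustration):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy stand-in for the real network, just to show the QAT flow."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # quantizes the fp32 input
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # back to fp32 output

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

# ... fine-tune here with fake-quant observers enabled ...

model.eval()
quantized = torch.quantization.convert(model)

# Each quantized module now carries a scale and zero_point, visible in Netron too.
print(quantized.conv.scale, quantized.conv.zero_point)

# Sanity-check by running inference on a dummy input.
out = quantized(torch.randn(1, 3, 64, 64))
```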

FengMu1995 commented 2 years ago

How do you quantize downsample_ratio? For example, with a downsample ratio of 0.5, after int8 quantization it can no longer achieve the original downsampling effect, right?

PeterL1n commented 2 years ago

@FengMu1995 you don't need to quantize everything. You only need to quantize weights and biases to improve inference speed. Things like downsample_ratio are model settings, so it makes sense to leave them as float32. That's why the float16 ONNX model still uses float32 for downsample_ratio.
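Concretely, in the repo's own Python API downsample_ratio is just a scalar argument to `forward`, independent of the weights' precision; a minimal sketch following the README usage (the video loader here is hypothetical):

```python
import torch
from model import MattingNetwork  # from this repo

model = MattingNetwork("mobilenetv3").eval().cuda()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth"))

rec = [None] * 4            # recurrent states, carried across frames
downsample_ratio = 0.25     # a plain float "model setting", never quantized

for src in your_video_loader:   # hypothetical frame source, NCHW tensors
    fgr, pha, *rec = model(src.cuda(), *rec, downsample_ratio)
```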

FengMu1995 commented 2 years ago

Well, I see. Thank you very much.

FengMu1995 commented 2 years ago

@PeterL1n Could you share the inference code for the rvm_mobilenetv3_fp16.torchscript model? I have tried for a long time, and the model I quantized directly with PyTorch seems to have a problem.

PeterL1n commented 2 years ago

It's the same as fp32.torchscript, but you have to call .half() on the input tensors.
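A sketch of that fp16 usage, on GPU given the CPU limitation discussed just below (the frame size is arbitrary, and the forward signature is assumed to match the PyTorch model's):

```python
import torch

# Load the released fp16 TorchScript model and move it to the GPU.
model = torch.jit.load("rvm_mobilenetv3_fp16.torchscript").cuda().eval()

src = torch.rand(1, 3, 1080, 1920).cuda().half()  # cast inputs with .half()
rec = [None] * 4                                  # initial recurrent states
fgr, pha, *rec = model(src, *rec, 0.25)           # 0.25 = downsample_ratio
```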

FengMu1995 commented 2 years ago

I changed the input to float16 and got the error below. Have you run into this problem? [screenshot of the error]

PeterL1n commented 2 years ago

If you are already using the latest PyTorch, then PyTorch doesn't support fp16 upsampling on the CPU. You can file an issue in the PyTorch repo.

FengMu1995 commented 2 years ago

@PeterL1n I tried torch 1.8 and torch 1.9, and both still report this error. Which version did you use to run fp16?

FengMu1995 commented 2 years ago

@PeterL1n I ran it successfully on GPU; it seems PyTorch on CPU does not support float16 yet.

xuxiaolon commented 2 years ago

> Did you use dynamic or static PyTorch quantization? I still can't get mine to run; it feels like the input was left unquantized while the model was quantized, which caused the problem. I didn't use ONNX here; I only used ONNX before to convert other models.

> QAT (quantization-aware training). If quantization succeeds, each quantized layer will have two values, scale and zero_point, which you can see in Netron. You can also verify the result by running inference.

Hi, which method did you use for the QAT of RVM, PyTorch FX? We ran into many problems while rewriting the model. Could you share your general experience?

zhanghongyong123456 commented 2 years ago

> I'd like to ask how to export an 8-bit quantized TorchScript model from the PyTorch model.

Could you please tell me how the rvm_mobilenetv3_fp16.torchscript model was converted? If I have a .pth model obtained from PyTorch training, how do I convert it to a .torchscript model usable by libtorch? Could you share the conversion code? Also, how much speedup can quantization give? Looking forward to your reply.
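Not the author's exact export script, but a minimal sketch of a .pth-to-TorchScript conversion along these lines (filenames assumed; scripting is assumed to work since the repo itself ships .torchscript models):

```python
import torch
from model import MattingNetwork  # from this repo

# Load trained weights into the model definition.
model = MattingNetwork("mobilenetv3").eval()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth", map_location="cpu"))

# Script and save the fp32 variant for libtorch.
torch.jit.script(model).save("rvm_mobilenetv3_fp32.torchscript")

# For an fp16 variant, halve the weights first; remember to also call
# .half() on the input tensors at inference time (and run on GPU).
torch.jit.script(model.half()).save("rvm_mobilenetv3_fp16.torchscript")
```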
