PeterL1n / RobustVideoMatting

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
https://peterl1n.github.io/RobustVideoMatting/
GNU General Public License v3.0

8-bit quantization #116

Closed · FengMu1995 closed this 2 years ago

FengMu1995 commented 2 years ago

I'd like to ask how to export an 8-bit quantized TorchScript model from the PyTorch model.

lblbk commented 2 years ago

Do you want to convert the quantized model to ONNX and then to DLC?

FengMu1995 commented 2 years ago

No, I quantize the PyTorch model directly and then export it to TorchScript. But neither the rvm_mobilenetv3_fp16.torchscript model from this repo nor my own int8-quantized model runs; float32 runs fine. I don't know whether the problem is in the conversion or in my inference code.

lblbk commented 2 years ago

> No, I quantize the PyTorch model directly and then export it to TorchScript. But neither the rvm_mobilenetv3_fp16.torchscript model from this repo nor my own int8-quantized model runs; float32 runs fine. I don't know whether the problem is in the conversion or in my inference code.

A quantized model should be convertible to TorchScript; I've converted a model similar to this one. Exporting it to ONNX is not supported, though.
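For reference, a minimal sketch of what this flow can look like in eager-mode PyTorch. Note that `torch.quantization.quantize_dynamic` only targets a few layer types (chiefly `nn.Linear` and LSTMs), so a convolution-heavy network like RVM generally needs static quantization or QAT instead; this may be why a naively quantized model fails at inference. The checkpoint and output filenames are assumptions, and scripting is assumed to work since the repo itself ships `.torchscript` models.

```python
import torch
from model import MattingNetwork  # model definition from this repo

# Load the released fp32 checkpoint.
model = MattingNetwork("mobilenetv3").eval()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth", map_location="cpu"))

# Dynamic quantization: weights of the listed layer types become int8 and
# activations are quantized on the fly. Only Linear/LSTM-style layers are
# supported, so RVM's convolutions stay fp32 under this method.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Scripting the quantized model usually works; exporting it to ONNX does not.
scripted = torch.jit.script(quantized)
scripted.save("rvm_mobilenetv3_int8_dynamic.torchscript")
```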

FengMu1995 commented 2 years ago

Did you use dynamic or static PyTorch quantization? I still can't get mine to run; it feels like the input was left unquantized while the model was quantized, which caused the problem. I didn't use ONNX here; I only used ONNX before to convert other models.

lblbk commented 2 years ago

> Did you use dynamic or static PyTorch quantization? I still can't get mine to run; it feels like the input was left unquantized while the model was quantized, which caused the problem. I didn't use ONNX here; I only used ONNX before to convert other models.

QAT (quantization-aware training). If quantization succeeds, each quantized layer will have two values, scale and zero_point, which you can see in Netron. You can also verify the result by running inference.
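To make the scale/zero_point point concrete, here is a minimal eager-mode QAT sketch on a toy module (not the poster's actual recipe; `TinyNet` is a stand-in for illustration):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy stand-in for the real network, just to show the QAT flow."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # quantizes the fp32 input
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # back to fp32 output

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

# ... fine-tune here with fake-quant observers enabled ...

model.eval()
quantized = torch.quantization.convert(model)

# Each quantized module now carries a scale and zero_point, visible in Netron too.
print(quantized.conv.scale, quantized.conv.zero_point)

# Sanity-check by running inference on a dummy input.
out = quantized(torch.randn(1, 3, 64, 64))
```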

FengMu1995 commented 2 years ago

How do you quantize downsample_ratio? For example, with a downsample ratio of 0.5, after int8 quantization it can no longer achieve the original downsampling effect, right?

PeterL1n commented 2 years ago

@FengMu1995 you don't need to quantize everything. You only need to quantize weights and biases to improve inference speed. Things like downsample_ratio are model settings, so it makes sense to leave them as float32. That's why the float16 ONNX model still uses float32 for downsample_ratio.
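Concretely, in the repo's own Python API downsample_ratio is just a scalar argument to `forward`, independent of the weights' precision; a minimal sketch following the README usage (the video loader here is hypothetical):

```python
import torch
from model import MattingNetwork  # from this repo

model = MattingNetwork("mobilenetv3").eval().cuda()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth"))

rec = [None] * 4            # recurrent states, carried across frames
downsample_ratio = 0.25     # a plain float "model setting", never quantized

for src in your_video_loader:   # hypothetical frame source, NCHW tensors
    fgr, pha, *rec = model(src.cuda(), *rec, downsample_ratio)
```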

FengMu1995 commented 2 years ago

Well, I see. Thank you very much.

FengMu1995 commented 2 years ago

@PeterL1n Could you share the inference code for the rvm_mobilenetv3_fp16.torchscript model? I have tried for a long time, and the model I quantized directly with PyTorch seems to have a problem.

PeterL1n commented 2 years ago

It's the same as fp32.torchscript, but you have to call .half() on the input tensors.
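A sketch of that fp16 usage, on GPU given the CPU limitation discussed just below (the frame size is arbitrary, and the forward signature is assumed to match the PyTorch model's):

```python
import torch

# Load the released fp16 TorchScript model and move it to the GPU.
model = torch.jit.load("rvm_mobilenetv3_fp16.torchscript").cuda().eval()

src = torch.rand(1, 3, 1080, 1920).cuda().half()  # cast inputs with .half()
rec = [None] * 4                                  # initial recurrent states
fgr, pha, *rec = model(src, *rec, 0.25)           # 0.25 = downsample_ratio
```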

FengMu1995 commented 2 years ago

I changed the input to float16 and got the error below. Have you run into this problem? [screenshot of the error]

PeterL1n commented 2 years ago

If you are already using the latest PyTorch, then PyTorch doesn't support fp16 upsampling on the CPU. You can file an issue in the PyTorch repo.

FengMu1995 commented 2 years ago

@PeterL1n I tried torch 1.8 and torch 1.9, and both still report this error. Which version did you use to run fp16?

FengMu1995 commented 2 years ago

@PeterL1n I ran it successfully on GPU; it seems PyTorch on CPU does not support float16 yet.

xuxiaolon commented 2 years ago

> Did you use dynamic or static PyTorch quantization? I still can't get mine to run; it feels like the input was left unquantized while the model was quantized, which caused the problem. I didn't use ONNX here; I only used ONNX before to convert other models.

> QAT (quantization-aware training). If quantization succeeds, each quantized layer will have two values, scale and zero_point, which you can see in Netron. You can also verify the result by running inference.

Hi, which method did you use for the QAT of RVM, PyTorch FX? We ran into many problems while rewriting the model. Could you share your general experience?

zhanghongyong123456 commented 2 years ago

> I'd like to ask how to export an 8-bit quantized TorchScript model from the PyTorch model.

Could you please tell me how the rvm_mobilenetv3_fp16.torchscript model was converted? If I have a .pth model obtained from PyTorch training, how do I convert it to a .torchscript model usable by libtorch? Could you share the conversion code? Also, how much speedup can quantization give? Looking forward to your reply.
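Not the author's exact export script, but a minimal sketch of a .pth-to-TorchScript conversion along these lines (filenames assumed; scripting is assumed to work since the repo itself ships .torchscript models):

```python
import torch
from model import MattingNetwork  # from this repo

# Load trained weights into the model definition.
model = MattingNetwork("mobilenetv3").eval()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth", map_location="cpu"))

# Script and save the fp32 variant for libtorch.
torch.jit.script(model).save("rvm_mobilenetv3_fp32.torchscript")

# For an fp16 variant, halve the weights first; remember to also call
# .half() on the input tensors at inference time (and run on GPU).
torch.jit.script(model.half()).save("rvm_mobilenetv3_fp16.torchscript")
```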
