Open dragen1860 opened 5 months ago

Hi, dear author: the memory reduction is very attractive and will benefit the method's adoption. I wonder whether current ONNX operators support the techniques you proposed, and whether inference with the ONNXRuntime framework is feasible?

Thank you for your interest in our work! We haven't tried ONNXRuntime yet, but we think it is applicable: MixDQ adopts a standard, deployment-friendly quantization scheme, and we have already tested MixDQ with the pytorch_quantization deployment tool.

If you are interested in deploying MixDQ with ONNXRuntime or other tools, we are also open to discussion and support. PRs are welcome!
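As a point of reference for the discussion, below is a minimal sketch of a generic ONNX export and ONNXRuntime quantized-inference path. It is not MixDQ's mixed-precision scheme — it uses ONNXRuntime's built-in post-training dynamic quantization on a toy placeholder model, and the model, input names, and file names are all illustrative assumptions.

```python
# Sketch only: generic ONNX export + ONNXRuntime INT8 inference.
# This is NOT the MixDQ mixed-precision W8A8 scheme; it just illustrates
# the plain ONNX/ONNXRuntime deployment path being discussed.
import torch
import torch.nn as nn
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType

# Toy placeholder standing in for a real network; MixDQ itself targets
# text-to-image diffusion UNets with per-layer mixed precision.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64)).eval()
dummy = torch.randn(1, 64)

# Export the FP32 graph to ONNX.
torch.onnx.export(model, dummy, "fp32_model.onnx",
                  input_names=["x"], output_names=["y"])

# Apply ONNXRuntime's generic post-training dynamic quantization
# (uniform INT8 weights, not MixDQ's mixed-precision allocation).
quantize_dynamic("fp32_model.onnx", "int8_model.onnx",
                 weight_type=QuantType.QInt8)

# Run the quantized graph with the default CPU execution provider.
sess = ort.InferenceSession("int8_model.onnx",
                            providers=["CPUExecutionProvider"])
out = sess.run(None, {"x": dummy.numpy()})[0]
print(out.shape)  # (1, 64)
```

Reproducing MixDQ's actual memory savings in ONNXRuntime would additionally require exporting the per-layer bit-width choices (e.g., as QDQ nodes with the calibrated scales), which is the part that would need discussion or a PR.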