Open IzanCatalan opened 9 months ago
Training will not work with a quantized model. How do you expect training to work with an INT8 model? Backpropagation can only happen with floats.
The error you're hitting is a result of onnxruntime trying to convert your graph to a QAT graph. QAT with onnxruntime is still in an experimental phase, and we do not have complete support for it.
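For context, you can confirm that the graph contains quantized operators (which is what the artifact-generation path trips over) with a quick inspection. This is a minimal sketch; the model path is illustrative:

```python
import onnx
from collections import Counter

# Path is illustrative; point it at your quantized model.
model = onnx.load("resnet50_int8.onnx")

# Count the quantization-related operators in the graph.
quant_ops = {"QuantizeLinear", "DequantizeLinear", "QLinearConv", "QLinearMatMul"}
counts = Counter(n.op_type for n in model.graph.node if n.op_type in quant_ops)
print(counts)  # non-empty output means the model is quantized (QDQ/QLinear)
```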
Thanks for the reply, @baijumeswani. Yes, you are totally right, backpropagation cannot be done. I had just hoped, and wanted to ask, whether there is a way of re-training a model with ORT using, as you said, QAT or Post-Training Quantization. Will the support you mentioned be available soon, or is it a long-term plan?
Anyway, if I must re-train some models to INT8, which as you said is currently impossible with ORT, do you have any thoughts on how I could do it (using QAT, for instance), even with a different framework or AI engine? Any help to clarify things would be highly appreciated.
Thank you.
Yes, we will add some support for training a (fake) quantized model in the near to mid term; maybe you can benefit from that. This is expected to land in ONNX Runtime 1.18. Will keep you posted on that.
I am not aware of any framework that offers training of quantized models on the device. Sorry about that.
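For reference, the closest widely used workflow is fake-quantization QAT in PyTorch: training runs entirely in float while quantization effects are simulated by observers, and real INT8 weights only appear at conversion time for inference. The sketch below uses PyTorch's eager-mode QAT API; the tiny model, random data, and hyperparameters are placeholders, not a recommendation:

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # float -> quantized boundary
        self.fc = nn.Linear(16, 4)
        self.dequant = tq.DeQuantStub()  # quantized -> float boundary

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet()
model.train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.prepare_qat(model, inplace=True)  # insert fake-quant observers

# Toy training loop on random data: backprop stays in float throughout.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(10):
    x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

model.eval()
int8_model = tq.convert(model)  # real INT8 weights, for inference only
```

Note that this still never trains the INT8 model itself; it trains a float model that simulates quantization, which matches the point above about backpropagation.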
Describe the issue
I am re-training some ONNX models from the ONNX Model Zoo repo, especially the quantized ResNet-50 with the INT8 datatype. However, when creating the training artifacts according to the onnx-runtime-training-examples repo, I get the following error:
I would like to know what to do to solve it. Is there any way of retraining or doing transfer learning with ORT?
For reference, my code looks like this:
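(The snippet below is a representative sketch of the artifact-generation pattern from onnx-runtime-training-examples, not the exact original; the model filename and the parameter-name filter are illustrative.)

```python
import onnx
from onnxruntime.training import artifacts

# Filename is illustrative; this is the quantized ResNet-50 from the Model Zoo.
model = onnx.load("resnet50-v1-12-int8.onnx")

# Mark which initializers should be trained vs. frozen (name filter is illustrative).
requires_grad = [p.name for p in model.graph.initializer if p.name.endswith("weight")]
frozen_params = [p.name for p in model.graph.initializer if p.name not in requires_grad]

# This call fails on a quantized (QDQ/QLinear) graph, as described above.
artifacts.generate_artifacts(
    model,
    requires_grad=requires_grad,
    frozen_params=frozen_params,
    loss=artifacts.LossType.CrossEntropyLoss,
    optimizer=artifacts.OptimType.AdamW,
    artifact_directory="artifacts",
)
```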
To reproduce
I am running an onnxruntime build from source with CUDA 11.2, GCC 9.5, CMake 3.27, and Python 3.8 on Ubuntu 20.04.
Urgency
As soon as possible
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
onnxruntime-training 1.17.0+cu112
PyTorch Version
None
Execution Provider
CUDA
Execution Provider Library Version
Cuda 11.2