Open adaber opened 1 week ago
Unofficially, we have used features from modelopt.onnx.quantization on Windows; the pip installation instructions are the same. Some of the modelopt.torch features might work as well.
Thanks for the quick response, @riyadshairi979 !
My plan is to use modelopt.torch.quantization initially (int8 quantization). I guess I should try and see if it works.
Do you plan on officially supporting Windows for Model Optimizer at some point, too?
Thanks!
My plan is to use modelopt.torch.quantization initially (int8 quantization).
If your deployment runtime is TensorRT, then we recommend using modelopt.onnx.quantization. ONNX exported after INT8 PTQ with modelopt.torch.quantization is not optimal on TensorRT.
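For context, a minimal sketch of the recommended INT8 PTQ path. The `quantize()` argument names below (`onnx_path`, `calibration_data`, `output_path`) and the install command are assumptions for illustration, not the confirmed ModelOpt API; check the official docs for the exact signature. Only the calibration-data helper is runnable as-is:

```python
import numpy as np

# Install (assumed package name): pip install nvidia-modelopt

def make_calibration_data(n_samples=32, shape=(3, 224, 224)):
    """Random stand-in calibration batch. In practice, use real
    preprocessed inputs that match the model's input shape."""
    return np.random.rand(n_samples, *shape).astype(np.float32)

calib = make_calibration_data()

# Hypothetical usage sketch -- parameter names are assumptions:
# from modelopt.onnx.quantization import quantize
# quantize(
#     onnx_path="model.onnx",          # FP32 ONNX model to quantize
#     calibration_data=calib,          # shapes must match model inputs
#     output_path="model.int8.onnx",   # INT8 ONNX for TensorRT
# )
```

The point of the helper is that the calibration batch must match the model's declared input shape, which is also relevant to the dynamic-shape question later in this thread.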
Windows at some point officially, too?
Yes, we are working on official support for Windows.
Hi @riyadshairi979,
Firstly, thank you for your help!
If your deployment runtime is TensorRT, then we recommend using modelopt.onnx.quantization. ONNX exported after INT8 PTQ with modelopt.torch.quantization is not optimal on TensorRT.
Thanks for sharing this important information. It will definitely save me time, because now I know which sub-package to focus on. I will give it a try and report the results. I may follow up with a question or two about this particular sub-package.
Windows at some point officially, too?
Yes, we are working on official support for Windows.
It's great to hear that.
Thanks!
I guess I already have 2 questions.
1) I assume that Model Optimizer does the calculations on a GPU, but I couldn't find an option that lets me pick which GPU to use when there is more than one in the system. Is there one? (I might have missed it, though.)
2) Does modelopt.onnx.quantization work with dynamic input shapes? I did a quick test and got "Tensor shape doesn't match for input". I can see that ModelOpt's code compares the input_shape and calibration_data sizes, and I assume that the input tensor's dynamic dimensions are what caused the assertion error.
Thanks!
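On question 1: the thread doesn't confirm a ModelOpt-specific device option, but a standard CUDA-level workaround is the CUDA_VISIBLE_DEVICES environment variable, which controls which physical GPUs the process can see at all:

```python
import os

# Set this BEFORE any CUDA-using library (torch, onnxruntime,
# TensorRT) is imported or initialized; it remaps which physical
# GPUs are visible to the process.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # expose only physical GPU 1

# From here on, device index 0 inside this process refers to
# physical GPU 1, so any library defaulting to "cuda:0" will use it.
# import modelopt.torch.quantization as mtq  # import after the env var
```

This works the same on Windows and Linux, since the variable is honored by the CUDA runtime rather than by any one framework.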
@riyadshairi979 Thanks for the quick response. It's very appreciated
Do you happen to know the approximate time frames for adding dynamic-shape and GPU-selection support?
Thanks!
Hi,
The main GitHub page lists only Linux, but has anyone tested Model Optimizer on Windows?
Thanks!