Closed: adaber closed this 2 months ago

Hi,
The main GitHub page lists only Linux, but has anyone tested Model Optimizer on Windows?
Thanks!
Unofficially, we have used features from modelopt.onnx.quantization on Windows; the pip installation instructions are the same. Some of the modelopt.torch features might work as well.
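For reference, the unofficial flow looks roughly like this (a sketch only: the quantize() argument names follow the public docs and may differ by version, and the calibration data here is random):

```python
# pip install nvidia-modelopt[onnx]   <- same instructions as on Linux
import numpy as np
from modelopt.onnx.quantization import quantize

# Smoke test: INT8 PTQ with random calibration data (shape must match the
# model input; a real run should use representative samples instead).
calib = np.random.rand(8, 3, 224, 224).astype(np.float32)
quantize(
    onnx_path="model.onnx",
    quantize_mode="int8",
    calibration_data=calib,
    output_path="model.quant.onnx",
)
```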
Thanks for the quick response, @riyadshairi979!
My plan is to use modelopt.torch.quantization initially (int8 quantization). I guess I should try and see if it works.
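For context, the flow I have in mind is roughly the documented PTQ pattern (a sketch; the model and calibration batches below are placeholders):

```python
import torch
import modelopt.torch.quantization as mtq

model = ...  # my network (placeholder)
calib_batches = [torch.randn(1, 3, 512, 512) for _ in range(16)]  # representative inputs

def forward_loop(m):
    # feed a few batches so the quantizers can collect activation ranges
    for batch in calib_batches:
        m(batch)

model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)
```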
Do you guys plan to officially support Windows at some point, too?
Thanks!
> My plan is to use modelopt.torch.quantization initially (int8 quantization).

If your deployment runtime is TensorRT, then we recommend using modelopt.onnx.quantization. ONNX exported after INT8 PTQ with modelopt.torch.quantization is not optimal on TensorRT.

> Do you guys plan to officially support Windows at some point, too?

Yes, we are working on official support for Windows.
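Coming back to the TensorRT point: concretely, the recommended path is to export the FP32 model to ONNX first and quantize the ONNX graph (a rough sketch; paths and shapes are placeholders):

```python
import torch
import numpy as np
from modelopt.onnx.quantization import quantize

# 1) Export the FP32 PyTorch model to ONNX (no torch-side quantization).
model = ...  # placeholder
torch.onnx.export(model, torch.randn(1, 3, 512, 512), "model.onnx")

# 2) Run INT8 PTQ on the ONNX graph; build the TensorRT engine from the output.
calib = np.random.rand(16, 3, 512, 512).astype(np.float32)
quantize(
    onnx_path="model.onnx",
    quantize_mode="int8",
    calibration_data=calib,
    output_path="model.quant.onnx",
)
```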
Hi @riyadshairi979,
Firstly, thank you for your help!

> If your deployment runtime is TensorRT, then we recommend using modelopt.onnx.quantization. ONNX exported after INT8 PTQ with modelopt.torch.quantization is not optimal on TensorRT.

Thanks for sharing this important information. It will definitely save me time, because now I know which sub-package to focus on. I will give it a try and report the results. I may follow up with a question or two about this particular sub-package.

> Do you guys plan to officially support Windows at some point, too?
> Yes, we are working on official support for Windows.

It's great to hear that.
Thanks!
I guess I already have 2 questions.
1) I assume that Model Optimizer does its calculations on a GPU, but I couldn't find an option that lets me pick which GPU to use when there is more than one in the system. Is there one? (I might have missed it, though.)
2) Does modelopt.onnx.quantization work with dynamic input shapes? I did a quick test and "Tensor shape doesn't match for input" popped up. I can see that ModelOpt's code compares the input_shape and calibration_data sizes, and I assume the input tensor's dynamic sizes might be what caused the assertion error (a rough repro sketch is below).
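For reference, a rough repro of what I ran (file names and sizes are illustrative):

```python
import os
import numpy as np
from modelopt.onnx.quantization import quantize

# Re 1): the only device workaround I found is pinning a GPU at the CUDA level,
# since I didn't see an explicit device argument in the API.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Re 2): the model input is dynamic (e.g. [-1, -1, -1, 3]) while the calibration
# tensors have concrete shapes; this is where the assertion seems to trigger.
calib = np.random.rand(4, 512, 512, 3).astype(np.float32)
quantize(
    onnx_path="dynamic_model.onnx",
    quantize_mode="int8",
    calibration_data=calib,
    output_path="dynamic_model.quant.onnx",
)
```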
Thanks!
@riyadshairi979 Thanks for the quick response. It's very much appreciated.
Do you happen to know the approximate time frame for adding dynamic shape and GPU support?
Thanks!
Our next release, modelopt v0.19, will have dynamic shape and GPU support; it is scheduled for release on 21 Oct 2024.
That's great to hear, riyadshairi979. Thanks!
@adaber Do you have a link to a sample ONNX model with dynamic shapes and calibration data to test with? I assume that you are interested in dynamic shapes other than the batch dimension. If that's the case, what does the calibration tensor shape look like if the corresponding input tensor has multiple dynamic dimensions, say a shape like [batch_size, 8, dim_2, 16]?
@riyadshairi979
Sorry for the late response. I didn't think you'd post here, so I didn't check whether the thread had been updated.
I'd be happy to help as much as I can, since we really want to use this tool for model quantization in the future :)
I use fully convolutional neural networks for semantic segmentation, so my dynamic input shapes are usually [-1, -1, -1, 3] (batchSize x H x W x NumChannels). They could also be [-1, 3, -1, -1] (batchSize x NumChannels x H x W). I'm not sure I can provide any samples, since they belong to the company I work for, but you could just use a UNet model and create simple synthetic samples for semantic segmentation (e.g., like the sketch below).
Dynamic shape input and GPU support are crucial for this tool to be efficient for semantic segmentation model quantization, so please let me know if there is anything else I can help with.
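For example, synthetic samples with varying spatial sizes could be generated like this (a sketch; the sizes are arbitrary):

```python
import numpy as np

# Synthetic calibration samples for an input shaped [-1, -1, -1, 3]
# (batchSize x H x W x NumChannels). Varying H and W per sample is exactly
# what exercises the dynamic dimensions.
rng = np.random.default_rng(0)
sizes = [(384, 512), (512, 512), (768, 1024)]
samples = [rng.random((1, h, w, 3), dtype=np.float32) for h, w in sizes]
```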
Thanks!