Closed: tsamiss closed this issue 2 months ago
Hi @tsamiss, DeepSparse does not support dynamic quantization. Static quantization model training and export are provided in neuralmagic/sparseml.
As there have been no further comments, I will go ahead and close out this issue. Best, Jeannie / Neural Magic
Describe the bug
I cannot load a dynamically quantized RoBERTa model in ONNX format for CPU inference. I can load the pre-quantized model just fine. I am currently working on a Vertex AI instance on GCP.
Expected behavior
The Engine is expected to load the model.
Environment
Include all relevant environment information:
f7245c8
]: 1.7.1

To Reproduce
Errors
Additional context
This error does not occur when using static quantization.