-
**Describe the bug**
I cannot seem to load a dynamically quantized RoBERTa model for CPU inference in ONNX format.
I can load the pre-quantized model just fine. Currently working on a Vertex AI inst…
-
Your inference turned out to be much more accurate than Ultralytics' inference.
I found this interesting thing called DeepSparse. It works faster than ONNX but unfortunately its inference is less a…
kopyl updated
4 months ago
-
Thanks for the great work!
Now that I have my own sparsified and GPTQ-quantized model, I'd like to run it in deepsparse to see some inference speedup or other advantages. To export it to ONNX, I tried …
-
**Describe the bug**
A clear and concise description of what the bug is.
**Expected behavior**
A clear and concise description of what you expected to happen.
**Environment**
Include all relevant en…
-
**Describe the bug**
I'm trying to replicate the sample code given in the DeepSparseSentenceTransformer documentation. I'm facing errors while executing it. It is mostly related to the version compat…
-
**Is your feature request related to a problem? Please describe.**
First of all, thank you to the developer for the example. Here is the frame rate test I want to run for real-time RTSP inference:
```
i…
-
The legacy text-generation pipeline supported trust_remote_code as an argument, which was inherited from the Transformers base pipeline here: https://github.com/neuralmagic/deepsparse/blob/491302a5135…
mgoin updated
2 months ago
-
I tried to convert an ONNX model with a dynamic batch size to deepsparse:
```
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs
onnx_filepath = "tts_model.onnx"
…
-
**Describe the bug**
When exporting the YOLOv8s (pruned50-quant, model.pt from sparsezoo) model via the ONNX exporter (sparseml.ultralytics.export_onnx), its performance noticeably decreases compar…
-
```
21/11/2022 07:29:41 PM [ INFO ] Running Nebullvm optimization on CPU
21/11/2022 07:29:43 PM [ WARNING ] Missing Frameworks: tensorflow.
Please install them to include them in the optimization …