-
### Describe the issue
I have a transformer model from which I'm exporting all the modules (i.e. source embedding, positional encoding, encoder, decoder, projection layer, etc.) separately to ONNX. For…
-
**Describe the bug**
Passing a Hugging Face model name through '-m' does not work when the model is quantized.
**To Reproduce**
Take https://huggingface.co/TheBloke/WizardLM-30B-GPTQ for example.
…
-
### 🐛 Describe the bug
Hello, I tried the tutorial
https://github.com/Deci-AI/super-gradients/blob/master/documentation/source/models_export.md
where I exported my PyTorch weights to ONNX. It seems …
-
### System Info
node: 22.7
nextjs: 14
### Environment/Platform
- [X] Website/web-app
- [ ] Browser extension
- [X] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Electron)
- [ ] O…
-
Hello @dusty-nv
I downloaded PeopleNet directly from https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/peoplenet. These are the contents of the downloaded folder:
labels.txt nvinfer_c…
-
Quark is a comprehensive cross-platform toolkit designed to simplify and enhance the quantization of deep learning models. Supporting both PyTorch and ONNX models, Quark empowers developers to optimiz…
-
### Describe the issue
I have quantized [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) model using t…
-
### OpenVINO Version
2023.3
### Operating System
Other (Please specify in description)
### Device used for inference
NPU
### Framework
None
### Model used
yolov8
### Issu…
-
I made some sample code to show how to use the NPU to run a YOLO model on MP4 files.
Currently it runs in real time on my Snapdragon X Elite Dev Box. Twice as fast as the Yolov4 GPU DirectML sample w…
-
### Describe the issue
I have a pre-trained CNN model in TensorFlow SavedModel format, and I converted it to **.onnx form** as well as a **statically quantized .onnx form**, and their inference latency at the…
vonJJ updated
5 months ago