quantized-onnx-models Search Results

1000+ results
for quantized-onnx-models

microsoft/onnxruntime #21535

[Web] Quantized model decreases in size, but takes same amou…

### Describe the issue I have a transformer model from which I'm exporting all the modules (i.e. source embedding, positional encoding, encoder, decoder, projection layer etc) separately to onnx. For…

kabyanil updated 2 months ago
4
microsoft/onnxruntime-genai #771

Builder '-m' does not support quantized models

**Describe the bug** Passing HF model name through '-m' does not work when the model is a quantized model. **To Reproduce** Take https://huggingface.co/TheBloke/WizardLM-30B-GPTQ for example. …

BowenBao updated 3 months ago
2
Deci-AI/super-gradients #1462

Yolo Onnx Export -> Worse Accuracy

### 🐛 Describe the bug Hello, I tried the tutorial https://github.com/Deci-AI/super-gradients/blob/master/documentation/source/models_export.md where I exported my torch weights to onnx. It seems …

Phyrokar updated 1 month ago
4
huggingface/transformers.js #944

Error: Could not locate file (500 error)

### System Info node: 22.7 nextjs: 14 ### Environment/Platform - [X] Website/web-app - [ ] Browser extension - [X] Server-side (e.g., Node.js, Deno, Bun) - [ ] Desktop app (e.g., Electron) - [ ] O…

iamhenry updated 2 months ago
10
dusty-nv/jetson-inference #1882

Running peoplenet with detectNet on jetPack6

Hello @dusty-nv I downloaded peoplenet directly from: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/peoplenet. These are the contents of the downloaded folder: labels.txt nvinfer_c…

AkshatJain-TerraFirma updated 4 months ago
3
vllm-project/vllm #10294

[Feature]: Quark quantization format upstream to VLLM

Quark is a comprehensive cross-platform toolkit designed to simplify and enhance the quantization of deep learning models. Supporting both PyTorch and ONNX models, Quark empowers developers to optimiz…

kewang-xlnx updated 3 days ago
5
microsoft/onnxruntime #20208

Intel OneDNN

### Describe the issue I have quantized [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) model using t…

heflinstephenraj-sa-14411 updated 1 month ago
2
openvinotoolkit/openvino #22846

[Bug]: Quantized model does not compile for device type=NPU

### OpenVINO Version 2023.3 ### Operating System Other (Please specify in description) ### Device used for inference NPU ### Framework None ### Model used yolov8 ### Issu…

shashichilappagari updated 1 week ago
27
microsoft/DirectML #664

Example to run yolo model on NPU

I made some sample code to show how to use the NPU to run a yolo model on mp4 files. Currently it runs in real time on my Snapdragon X Elite Dev Box. Twice as fast as the Yolov4 GPU DirectML sample w…

fobrs updated 5 days ago
2
microsoft/onnxruntime #14707

[Performance]why is the inference latency of onnx QDQ quanti…

### Describe the issue I have a pre-trained CNN model saved as a TensorFlow SavedModel, and I converted it to **.onnx form** as well as a **static quantized .onnx form**, and their inference latency at the…

vonJJ updated 5 months ago
5
