-
I made some sample code showing how to use the NPU to run a YOLO model on MP4 files.
It currently runs in real time on my Snapdragon X Elite Dev Box, twice as fast as the YOLOv4 GPU DirectML sample w…
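For anyone curious, running an ONNX model on the Snapdragon NPU generally goes through onnxruntime's QNN execution provider. A minimal sketch, assuming that provider is what the sample uses (the model path and HTP backend option are placeholders, not the sample's actual code):

```python
import onnxruntime as ort

# Sketch: create a session on the QNN execution provider so the model
# runs on the Hexagon NPU, falling back to CPU for unsupported ops.
session = ort.InferenceSession(
    "yolo.onnx",  # hypothetical model path
    providers=["QNNExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"backend_path": "QnnHtp.dll"}, {}],
)
print(session.get_providers())  # confirm which providers were registered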
-
### Describe the issue
When running this:
```python
import os
def quantize_onnx_model(onnx_model_path, quantized_model_path):
from onnxruntime.quantization import quantize_dynamic, QuantType
…
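For reference, a complete version of such a helper would typically look like the sketch below; the function body and the weight type are assumptions, since the original snippet is cut off:

```python
import os
from onnxruntime.quantization import quantize_dynamic, QuantType

def quantize_onnx_model(onnx_model_path, quantized_model_path):
    # Dynamic quantization: weights are converted to 8-bit integers offline,
    # activations are quantized on the fly at inference time.
    quantize_dynamic(
        onnx_model_path,
        quantized_model_path,
        weight_type=QuantType.QUInt8,  # assumption; QInt8 is also common
    )
    print(f"quantized model: {os.path.getsize(quantized_model_path)} bytes")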
-
### Is there an existing issue for this problem?
- [X] I have searched the existing issues
### Operating system
Windows
### GPU vendor
Nvidia (CUDA)
### GPU model
RTX 3060
### GPU VRAM
12GB
…
-
### Describe the issue
I am trying to load XGBoost ONNX models using onnxruntime on a Windows machine.
The model size is 52 MB, but loading it consumes 1378.9 MB of RAM. The time to load …
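For what it's worth, the kind of measurement being reported can be reproduced roughly like this (a sketch; the model path is a placeholder and `psutil` is assumed to be installed):

```python
import os
import time

import onnxruntime as ort
import psutil  # assumed available; used only for the RSS measurement

proc = psutil.Process(os.getpid())
rss_before = proc.memory_info().rss

start = time.perf_counter()
session = ort.InferenceSession("xgboost_model.onnx",  # placeholder path
                               providers=["CPUExecutionProvider"])
elapsed = time.perf_counter() - start

rss_after = proc.memory_info().rss
print(f"load time: {elapsed:.2f} s, "
      f"RSS growth: {(rss_after - rss_before) / 2**20:.1f} MiB")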
-
### 🐛 Describe the bug
```
torch.onnx.errors.SymbolicValueError: ONNX symbolic expected the output of `%2212 : Tensor = onnx::Squeeze(%2186, %2211), scope: SimpleLSTMNet::/torch.ao.nn.quantized.modu…
```
-
**Is your feature request related to a problem? Please describe.**
The Phi-3 series of models offers SOTA performance, especially in reasoning, math, and coding. Microsoft released the models under…
-
Here are the commands / code to reproduce.
To generate a Llama 3 opset-20 ONNX model:
```
pip install optimum[exporters]
huggingface-cli login
optimum-cli export onnx --model meta-llama/Meta-Llama-3-8B-In…
```
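Once the export finishes, a quick way to sanity-check the resulting model is to load it back through optimum's ONNX Runtime wrapper. A minimal sketch, where the output directory name is an assumption (the original command is truncated before it):

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

out_dir = "./llama3-onnx"  # hypothetical export output directory
tokenizer = AutoTokenizer.from_pretrained(out_dir)
model = ORTModelForCausalLM.from_pretrained(out_dir)

# Smoke test: generate a few tokens through the ONNX session.
inputs = tokenizer("Hello, world", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output[0], skip_special_tokens=True))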
-
### System Info
transformers v2.17.2
node v18.20.3
### Environment/Platform
- [ ] Website/web-app
- [ ] Browser extension
- [X] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Elect…
-
### Model description
[jinaai/jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1/tree/main/onnx)
### Prerequisites
- [X] The model is supported in Transformers (i.e., listed [here](https://hu…
-
I have converted Google's [flan-t5-small](https://huggingface.co/google/flan-t5-small) using the `fastT5.export_and_get_onnx_model` method with quantization enabled by default:
```python
import sys, os…
```
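For context, the fastT5 export being described usually reduces to something like the following sketch; the generation example is illustrative, not the issue's actual code:

```python
from fastT5 import export_and_get_onnx_model
from transformers import AutoTokenizer

model_name = "google/flan-t5-small"
# Exports encoder/decoder to ONNX and quantizes them (fastT5's default).
model = export_and_get_onnx_model(model_name)

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokens = tokenizer("translate English to German: Hello", return_tensors="pt")
out = model.generate(input_ids=tokens["input_ids"],
                     attention_mask=tokens["attention_mask"])
print(tokenizer.decode(out[0], skip_special_tokens=True))
```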