-
Great work! Please provide the ONNX model and an inference demo to facilitate model deployment.
-
Currently, session creation fails for ONNX models with external data (i.e., any model larger than 2 GB) on linux-x64.
For instance, trying to get input/output info for the following model: https://hug…
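For context, a minimal sketch of the same steps using the onnxruntime Python API (the original report may concern a different language binding; the model path here is a placeholder, and a model with external data is assumed to keep its weight files alongside the `.onnx` graph):

```python
import onnxruntime as ort

# Hypothetical path; for an external-data model the weight files
# (e.g. model.onnx.data) must sit in the same directory as the graph.
model_path = "model.onnx"

session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

# Query input/output metadata, the step described in the report.
for inp in session.get_inputs():
    print("input:", inp.name, inp.shape, inp.type)
for out in session.get_outputs():
    print("output:", out.name, out.shape, out.type)
```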
-
Description
The ONNX Runner application works correctly with the non-quantized version of the Qwen2-0.5B-Instruct model but encounters an error when trying to use the quantized version.
Working Co…
-
-
### Feature request type
sample request
### Is your feature request related to a problem? Please describe
In the documentation there is always a reference to `Mkldnn` usage, but, apparently, the…
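If the documentation in question is ONNX Runtime's, a rough sketch of enabling its oneDNN (formerly MKL-DNN) execution provider from Python might look like the following; the provider name is real but only exists in builds compiled with oneDNN support, and `model.onnx` is a placeholder:

```python
import onnxruntime as ort

# DnnlExecutionProvider is ONNX Runtime's oneDNN (ex-MKL-DNN) backend.
# It appears in get_available_providers() only in builds that include it.
print(ort.get_available_providers())

session = ort.InferenceSession(
    "model.onnx",
    providers=["DnnlExecutionProvider", "CPUExecutionProvider"],
)
print("active providers:", session.get_providers())
```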
-
### Description
ONNX (Open Neural Network Exchange) provides cross-platform compatibility.
An operator that can run inference using ONNX models, ideal for deploying machine learning models in a …
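As an illustration of what such an operator would typically wrap, here is a minimal, hypothetical inference helper built on the onnxruntime Python API (the model path, input name, and shapes are placeholders, not part of the request):

```python
import numpy as np
import onnxruntime as ort

def run_onnx_inference(model_path: str, feeds: dict) -> list:
    """Run a single forward pass on an ONNX model and return all outputs."""
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    output_names = [o.name for o in session.get_outputs()]
    return session.run(output_names, feeds)

# Hypothetical usage: a model with one float32 input named "input".
# outputs = run_onnx_inference("model.onnx",
#                              {"input": np.zeros((1, 3, 224, 224), np.float32)})
```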
-
### Feature request
Support loading and running inference with ONNX models.
### Motivation
In a CPU-only environment I have to run the Embedding & Reranker models on the CPU. In my tests, ONNX gives faster processing than Sentence Transformers.
### Your contribution
https…
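A rough sketch of the requested CPU-only path, assuming onnxruntime plus a Hugging Face tokenizer for an embedding model exported to ONNX (the model name, file path, and mean-pooling choice are assumptions for illustration):

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Hypothetical model; any embedding model exported to ONNX would do.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
session = ort.InferenceSession("bge-small-en-v1.5.onnx",
                               providers=["CPUExecutionProvider"])
input_names = {i.name for i in session.get_inputs()}

def embed(texts: list) -> np.ndarray:
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="np")
    feeds = {k: v for k, v in enc.items() if k in input_names}
    hidden = session.run(None, feeds)[0]               # (batch, seq, dim)
    mask = enc["attention_mask"][..., None].astype(np.float32)
    # Mean pooling over non-padded tokens, then L2-normalise
    # (some embedding models use CLS pooling instead).
    emb = (hidden * mask).sum(axis=1) / mask.sum(axis=1)
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)
```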
-
Hi, I have used optimum from huggingface to convert the model to ONNX, but it is impossible to open the model in unpaint because it doesn't contain a safety_checker folder.
I have tried to change th…
-
### Feature request
It looks like ONNX now supports 4-bit: https://onnx.ai/onnx/technical/int4.html
It would be nice if we could use 4-bit models with transformers.js.
### Motivation
Make models f…
-
# Bug Report
### Describe the bug
I am trying to split [Phi-3 INT4 ONNX](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx/tree/main/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4) …