-
I saw the issue with chatglm2-6b.
It runs successfully with `numactl -m 0 -C 0-23`.
It fails with `numactl -m 0 -C 0-31`, `-C 0-47`, or `-C 0-55`.
It can be reproduced with INT8_ASYM or 4BIT_…
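
To compare the bindings in this report, here is a minimal sketch that expands the `-C` core ranges into CPU ids and counts them. It may help when checking whether the failing ranges cross a socket or NUMA-node boundary on the test machine. That boundary idea is an assumption, not something the report states, and `expand_cpu_list` is a hypothetical helper name, not part of numactl.

```python
def expand_cpu_list(spec: str) -> list[int]:
    """Expand a numactl-style CPU list such as "0-23" or "0-3,8-11" into CPU ids."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Inclusive range, e.g. "0-23" covers CPUs 0 through 23.
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# Core ranges and outcomes taken from the report above.
bindings = {"0-23": "ok", "0-31": "fails", "0-47": "fails", "0-55": "fails"}
core_counts = {spec: len(expand_cpu_list(spec)) for spec in bindings}

if __name__ == "__main__":
    for spec, outcome in bindings.items():
        print(f"-C {spec}: {core_counts[spec]} cores, {outcome}")
```

Comparing these counts against the per-node core list that `numactl --hardware` prints on the affected machine would show whether the working range stays within node 0.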
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussi…
-
I found an interesting [model](https://github.com/Picsart-AI-Research/MI-GAN/tree/main) for removing objects from images. I'm going to add it to the comparisons-rten repo, and I have already prepared the Python code. B…
-
From the thread "https://developer.apple.com/forums/thread/740518 how do we use the computational power of A17 Pro Neural Engine?"
I learned that if I want to run inference with my mlmodel on my iPad Pro with …
-
Hi, I have quantized a YOLOv8 model to int8 parameters. Could you please guide me on how to modify the demo code to make it compatible for running with the int8 quantized model?
![screenshot-202409…
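
For context on what int8 parameters usually mean for pre- and post-processing code, here is a minimal sketch of affine (scale and zero-point) quantization. It assumes the quantized model takes int8 inputs and returns int8 outputs with a per-tensor scale and zero point. The question does not say which export tool or demo code is involved, so the function names and values below are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def quantize(values, scale, zero_point):
    """Map floats to int8: q = round(x / scale) + zero_point, clamped to the int8 range."""
    return [
        max(INT8_MIN, min(INT8_MAX, round(v / scale) + zero_point))
        for v in values
    ]


def dequantize(q_values, scale, zero_point):
    """Map int8 values back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]


if __name__ == "__main__":
    # Hypothetical per-tensor parameters; real ones come from the quantized model.
    scale, zero_point = 0.5, 10
    q = quantize([1.0, -2.0, 100.0], scale, zero_point)
    print(q)
    print(dequantize(q, scale, zero_point))
```

If the runtime quantizes and dequantizes at the model boundary itself, the demo code may need no such step. That depends on how the model was exported.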
-
Below is the list of issues we are hitting when running [vision int8 models](https://github.com/nod-ai/SHARK-TestSuite/blob/merge-reports/e2eshark/ci_model_lists/shark-test-suite.txt) end to end usin…
-
### Describe the issue
I'm using ONNX Runtime to run inference on GPU.
I have installed CUDA 10.1, onnxruntime-gpu 1.4.0, and onnx 1.10.2.
The inference is with resnet50-v1-12-int8.onnx mo…
-
Has anyone successfully deployed on Orin? What's the inference time like?
-
Inspired by a recent back-and-forth with @gau-nernst, we should add some quantized training recipes in AO for small models (600M parameter range).
Character.ai recently shared that they're working on qua…
-
With:
* https://github.com/pytorch/benchmark/commit/7e1ba8d5983e4ff31cbf79d0f5dec071d11370cd
* https://github.com/pytorch/pytorch/commit/0aa41eb52f7e577cf88e0f1b0adb34167a9ae94b
* https://github.co…