-
### 🚀 Feature request
Quantization is a widely used technique to accelerate models, particularly when using the [torch.compile](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.htm…
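As a minimal sketch of combining the two (the toy model and shapes below are made up for illustration, not taken from the request), eager-mode dynamic quantization can be applied before handing the module to `torch.compile`; ops the compiler cannot trace simply fall back to eager execution:

```python
import torch
import torch.nn as nn

# Toy stand-in model; any nn.Module with Linear layers works the same way.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

# Post-training dynamic quantization: weights stored as int8, activations
# quantized on the fly at each Linear call.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Compile the quantized module; TorchDynamo graph-breaks around quantized
# ops it cannot handle and runs them eagerly.
compiled = torch.compile(qmodel)
print(compiled(torch.randn(1, 512)).shape)
```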
-
### System Info
GPU - A10
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially supported task in the `…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### YOLOv8 Component
Export
### Bug
When us…
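For context, a typical INT8 export call through the documented Ultralytics API looks like the sketch below; whether this matches the truncated reproduction above is an assumption:

```python
from ultralytics import YOLO

# Load a pretrained checkpoint and export a TensorRT engine with INT8
# calibration enabled.
model = YOLO("yolov8n.pt")
model.export(format="engine", int8=True)
```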
-
### Describe the issue
1. Tried running https://github.com/intel/intel-extension-for-pytorch/blob/release/2.3/examples/cpu/inference/python/llm/run.py to generate the q_config_summary file
2. Then…
-
By using [pytorch-quantization](https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html) I was able to create TensorRT engine models that are (almost) fully int8 and…
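A minimal sketch of that workflow following the toolkit docs; the ResNet-50 example network and file names are placeholders, and the calibration pass is elided:

```python
import torch
import torchvision
from pytorch_quantization import quant_modules, quant_nn

# Monkey-patch torch.nn so layers created afterwards carry INT8 fake-quant
# (Q/DQ) wrappers.
quant_modules.initialize()

model = torchvision.models.resnet50(weights="DEFAULT").eval()
# ... run calibration data through the model to set the quantizer ranges ...

# Emit ONNX QuantizeLinear/DequantizeLinear nodes on export, then build the
# engine with e.g. `trtexec --onnx=resnet50_int8.onnx --int8`.
quant_nn.TensorQuantizer.use_fb_fake_quant = True
torch.onnx.export(model, torch.randn(1, 3, 224, 224), "resnet50_int8.onnx",
                  opset_version=13)
```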
-
### 🐛 Describe the bug
After QAT training, inference fails with the following error:
NotImplementedError: Could not run 'quantized::linear' with arguments from the 'CPU' backend. This could be …
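This error usually means a regular fp32 tensor reached a converted quantized module, most often because the input was never routed through a `QuantStub` or the model was not converted before inference. A minimal eager-mode sketch of the expected flow (toy model for illustration):

```python
import torch
import torch.nn as nn

class QATModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()      # fp32 -> int8 at the input
        self.linear = nn.Linear(16, 4)
        self.dequant = torch.ao.quantization.DeQuantStub()  # int8 -> fp32 at the output

    def forward(self, x):
        return self.dequant(self.linear(self.quant(x)))

model = QATModel()
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
torch.ao.quantization.prepare_qat(model, inplace=True)
# ... QAT fine-tuning loop ...
model.eval()
qmodel = torch.ao.quantization.convert(model)  # swaps in quantized::linear
print(qmodel(torch.randn(1, 16)))              # now runs on the quantized CPU backend
```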
-
Hello all,
I was curious how actively Bonito is still being used, as I read that Dorado has now converted a majority of its neural network code to INT8.
I was interested in experimenting with t…
-
Hi,
I have just installed the TensorRT Model Optimizer using `pip install "nvidia-modelopt[all]" --no-cache-dir --extra-index-url https://pypi.nvidia.com`. I was then using it to quantize an ONNX m…
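For reference, a minimal PTQ sketch assuming the `modelopt.onnx.quantization.quantize` entry point from the Model Optimizer docs; the argument names and the random calibration batch below are illustrative and should be checked against the installed version:

```python
import numpy as np
from modelopt.onnx.quantization import quantize

# Placeholder calibration batch shaped like the model's input.
calibration_data = np.random.rand(32, 3, 224, 224).astype(np.float32)

quantize(
    onnx_path="model.onnx",
    quantize_mode="int8",
    calibration_data=calibration_data,
    output_path="model.quant.onnx",
)
```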
-
Can AutoAWQ support int2, int3, or int8 quantization?
I see it only supports int4 quantization now.
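For reference, the currently documented path is 4-bit only; a minimal sketch of it (the model id is just an example):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "facebook/opt-125m"  # example model id
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# w_bit=4 is the supported width at the time of the question.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized("opt-125m-awq")
```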
-
From the issue https://developer.apple.com/forums/thread/740518 ("how do we use the computational power of the A17 Pro Neural Engine?")
I learned that if I want to run inference with my mlmodel on my iPad Pro with …
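On the Python side, coremltools lets you request the Neural Engine when loading a model; a minimal sketch follows (the file name is a placeholder), with the Swift equivalent being `MLModelConfiguration.computeUnits = .all`:

```python
import coremltools as ct

# ComputeUnit.ALL lets Core ML schedule ops on the Neural Engine where
# supported, falling back to GPU/CPU otherwise.
model = ct.models.MLModel("MyModel.mlpackage", compute_units=ct.ComputeUnit.ALL)
# prediction = model.predict({"input": ...})  # keys depend on your model's inputs
```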