-
Hello everyone,
Following the Diffusion Models Quantization with Model Optimizer guide, after running this command:
`python quantize.py --model sdxl-turbo --format int8 --batch-size 2 --calib-size 32 --collect-met…
-
### Search before asking
- [X] I have searched the HUB [issues](https://github.com/ultralytics/hub/issues) and [discussions](https://github.com/ultralytics/hub/discussions) and found no similar quest…
-
I was looking at a past issue (see below) regarding how to set up my custom dataset for INT8 calibration. I was just wondering what subset of the dataset indicated in the YAML file is used for calibra…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
Expo…
-
reference: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/blogs/quantization-in-TRT-LLM.md#performance
![image](https://github.com/user-attachments/assets/1bb20225-3eb2-4641-b5ba-f027…
-
I am trying to follow the instructions for quantization with the YOLOv8 model, and after reading the 1000 calibration images, TensorRT errors out with:
ERROR: [TRT]: 10: Could not find any implement…
-
Hey,
I'm looking to perform `int8 * int8 -> fp32`, where at the output stage I dequantise the `int32_t` result into `float` (and then potentially add a bias). I was following the example from https:…
-
GDIT is testing a static analysis tool called Codee (https://www.codee.com/). Codee flagged an issue in the ncdiag code. The code uses the `int8(arg)` routine to convert its argument into a 64-bit int…
-
## Environment
- OS: [Ubuntu 20.04]
- Hardware (GPU, or instance type): [H800]
## To reproduce
I have a 2G jsonl.gz text file, which I tokenized and stored as a NumPy array. The writer is d…
-
The following input example: https://gist.github.com/qedawkins/c620832f96a5c504295f9694cc8956e2
Which is approximately compiled from
```
// int8_t a[144]; // [2 x 8 x 3 x 3]
// int8_t b[144]; //…