-
# Define the quantization configuration
quant_config = tq.get_default_qconfig('fbge…
model_fused.qconfig = quant_config

# Quantize the model
model_prepared = tq.prepare(model_fused)
model_quantized = tq.convert(model_prepared)
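For context on what the `prepare`/`convert` workflow above ultimately computes, here is a minimal sketch of per-tensor affine (asymmetric) quantization, the scale/zero-point arithmetic that backends such as fbgemm apply. The function names here are illustrative, not part of the `torch.quantization` API:

```python
# Minimal sketch of per-tensor affine (asymmetric) quantization.
# Function names are illustrative, not torch.quantization API.

def choose_qparams(xmin, xmax, qmin=0, qmax=255):
    """Pick scale/zero-point so [xmin, xmax] maps onto [qmin, qmax]."""
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)  # range must contain 0.0
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = int(round(qmin - xmin / scale))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """float -> int: divide by scale, shift by zero-point, clamp."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]

def dequantize(qvalues, scale, zero_point):
    """int -> approximate float: shift back and rescale."""
    return [(q - zero_point) * scale for q in qvalues]

x = [-1.0, 0.0, 0.5, 2.0]
scale, zp = choose_qparams(min(x), max(x))
x_hat = dequantize(quantize(x, scale, zp), scale, zp)
# Round-trip error is bounded by the scale (one quantization step).
```

During calibration, `prepare` inserts observers that record `xmin`/`xmax` per tensor, and `convert` bakes the resulting scale and zero-point into the quantized modules.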
-
### 1. System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04
- TensorFlow installation (pip package or built from source): pip package
- TensorFlow library (v…
-
Hi, thank you very much for your very helpful tutorial. I followed the code from your video to the letter in my script, and everything works until I get to the conversion part, specifically at: "conve…
-
Hi,
can you help add a runtimeClass to the nimcache and all the other CRDs?
I got this error:
Traceback (most recent call last):
  File "/usr/local/bin/download-to-cache", line 5, in <module>
    from vllm_nv…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and f…
-
### Describe the issue
ORT crashes while loading this specific INT4 model.
We can observe the issue on both the DML EP and the CPU EP.
![image](https://github.com/user-attachments/assets/80b2035d-bb1e-4…
-
### 🐛 Describe the bug
python torchchat.py generate stories110M --quant torchchat/quant_config/cuda.json --prompt "It was a dark and stormy night, and"
Using device=cuda Tesla T4
Loading model...…
-
### Search before asking
- [X] I have searched the HUB [issues](https://github.com/ultralytics/hub/issues) and [discussions](https://github.com/ultralytics/hub/discussions) and found no similar quest…
-
Hi, I noticed that you use mixlib in your code
https://github.com/Qcompiler/vllm-mixed-precision/blob/8a941fc4d19fe41e3cce433b40b0f15100d19f02/vllm/model_executor/layers/quantization/mixq4bit.py#L74
…
-
Hi, I am trying the _gap8_app_ on my Crazyflie.
I ran the _gap_sdk_ starting from the `bitcraze/ai-deck` Dockerfile, installing conda and creating a new environment as described in the readme. Howeve…