-
I tried to modify your example code to run this model on a low-VRAM card using a BNB 4-bit or 8-bit quantization config.
When using a bnb 4-bit config like the one below:
```python
qnt_config = BitsAndBytesConfig(load…
```
-
-
Model parameter quantization is missing from the tutorial code.
Please add it.
---------------------------------------------------------------------------
KeyError Trace…
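For reference, a minimal sketch of what a 4-bit quantized load typically looks like with the standard `transformers`/`bitsandbytes` API. The model identifier is a placeholder, not the tutorial's actual checkpoint, and the parameter choices here are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Standard transformers options for bitsandbytes NF4 quantization;
# values chosen here are common defaults, not the tutorial's.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# "model-id-here" is a placeholder for the tutorial's model checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "model-id-here",
    quantization_config=bnb_config,
    device_map="auto",
)
```

Passing the config via `quantization_config=` is what quantizes the model parameters at load time, which appears to be the step missing from the tutorial.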
-
Can someone share the details of how to quantize mobilenetv2 or resnet50?
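As background on what quantizing such a model does numerically (independent of any particular toolkit), here is a small self-contained sketch of per-tensor symmetric int8 quantization, the arithmetic tools apply to conv/linear weights in models like mobilenetv2 or resnet50. The helper names are made up for illustration:

```python
# Illustrative sketch of per-tensor symmetric int8 quantization.
# Helper names (quantize_int8, dequantize_int8) are hypothetical.

def quantize_int8(weights):
    """Map floats to int8 using a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.9]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
# Rounding error is bounded by half a quantization step (scale / 2).
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, recovered))
```

Real toolchains add per-channel scales, activation calibration, and operator fusion on top of this basic scheme, but the scale/round/clamp core is the same.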
-
### OpenVINO Version
2024.2.0-15519-5c0f38f83f6-releases/2024/2
### Operating System
Ubuntu 22.04 (LTS)
### Device used for inference
CPU
### OpenVINO installation
PyPI
### Programming Languag…
-
@jiqing-feng I am going to answer the gptqmodel specifics here.
By `transformers` integration, do you mean `AutoModel` loading of quantized models? Hf transformers moved all quantizati…
-
When finetuning llama-3.1-8b or mistral-nemo-12b (I only tried those; it doesn't seem to depend on the model), unsloth also uploads the F16 result to huggingface, even though my script should only upload the Q4_K…
-
Hi @CY-CHENYUE,
Thank you so much for your fantastic work on integrating Molmo into ComfyUI; it's greatly appreciated!
I wanted to ask if you could possibly enable support for non-quantized …
-
I found the statement **3. Better support for vision transformers.** at https://nvidia.github.io/TensorRT-Model-Optimizer/guides/_onnx_quantization.html.
I'm working on quantizing VIT n…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
Val
…