-
I want to quantize the CodeQwen model using a custom dataset, but all of my sample lengths exceed 512. Why doesn't AWQ support samples longer than 512 tokens? Are there any alternative methods for quan…
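One workaround, assuming the calibration pipeline simply drops over-length samples, is to split each long sample into chunks of at most 512 tokens before passing them in as calibration data. A minimal sketch with a generic tokenizer interface (the `tokenize`/`detokenize` callables are assumptions about your tokenizer's API, not part of AutoAWQ):

```python
def chunk_sample(token_ids, max_len=512):
    """Split one tokenized sample into consecutive chunks of at most max_len tokens."""
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]

def build_calib_chunks(samples, tokenize, detokenize, max_len=512):
    """Tokenize each text sample, chunk it, and decode chunks back to text."""
    chunks = []
    for text in samples:
        for ids in chunk_sample(tokenize(text), max_len):
            chunks.append(detokenize(ids))
    return chunks
```

The resulting list of short texts could then be passed as the calibration dataset to the quantizer; whether any accuracy is lost by cutting samples at chunk boundaries would need to be checked empirically.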
-
Not sure if this is feasible, but I would love to see a modified save-checkpoint node for this somehow, so we can save GGUF or EXL2 checkpoints for merging LoRAs into a GGUF or EXL2 checkpoint directly.
Check o…
-
Hi everyone,
I'm trying to quantize the YOLOv5n model from [here](https://github.com/ultralytics/yolov5). I'm using the Vitis-AI v3.0 Docker image with the following code:
```
import pytorch_nndct
i…
-
I have fine-tuned Llama 3.1 using Unsloth. Then I merged and unloaded the LoRA model and pushed it to the Hub.
Now, when I tried quantizing it using:
```
from awq import AutoAWQForCausalLM
qua…
-
I am having trouble running the latest Llama 3.1 on OpenVINO. I am trying to use optimum-intel to convert the new model, but I always fail with an error. It would be great to have 3.1 already quanti…
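For reference, optimum-intel ships a CLI that exports a Hugging Face model straight to OpenVINO IR with weight compression; a sketch of the invocation (the model ID, weight format, and output directory here are assumptions about your setup, not taken from the report):

```shell
# Requires: pip install optimum[openvino]
# Exports the model to OpenVINO IR with int4 weight compression.
optimum-cli export openvino \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --weight-format int4 \
  llama-3.1-8b-ov-int4
```

If the export itself fails, the full traceback from this command is usually what the maintainers need to diagnose the issue.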
-
In Colab, I ran some tests quantizing some AI models.
Once quantized, the files are usually between 4 and 7 GB.
The only thing that seems to work is to move them temporarily to Google Drive and th…
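For the Drive step, a buffered streaming copy keeps memory flat even for multi-GB files; a minimal stdlib sketch (the paths in the comment are placeholders for a mounted Drive folder):

```python
import shutil

def copy_large_file(src, dst, buf_mb=16):
    """Stream-copy a large file in fixed-size buffers so memory use stays flat."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, length=buf_mb * 1024 * 1024)

# e.g. copy a quantized model into a mounted Drive folder:
# copy_large_file("model-q4.gguf", "/content/drive/MyDrive/model-q4.gguf")
```

An alternative worth considering is uploading straight to the Hugging Face Hub with `huggingface_hub`'s upload helpers, which stream from disk and skip the Drive detour entirely.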
-
Hello,
Is it possible to load a LoRA model, i.e. a PEFT model with an adapter such as Alpaca-LoRA (https://github.com/tloen/alpaca-lora)?
There is a script there to add the PEFT weights to the model, but it d…
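Conceptually, folding a LoRA adapter into the base weights is just W' = W + (alpha/r)·(B @ A), after which the adapter matrices can be discarded and the model saved as a plain checkpoint. A minimal NumPy sketch of that update (shapes and scaling follow the standard LoRA convention; this is an illustration, not the alpaca-lora script's code):

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a LoRA adapter into a base weight matrix.

    W: (out, in) base weights; A: (r, in) and B: (out, r) low-rank factors.
    The merged matrix computes W @ x + (alpha / r) * B @ A @ x in one matmul.
    """
    return W + (alpha / r) * (B @ A)
```

In the PEFT library, this merge-then-discard step is what `merge_and_unload()` performs on a loaded adapter.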
-
When quantizing DeepSeek Coder models, the tokenizer.json file seems to be throwing an error. This wasn't an issue previously.
Cross posting from [here](https://github.com/ggerganov/llama.cpp/issue…
-
The goal of this ticket is to track support for unknown scales and zero-points. This is required to represent, in a StableHLO graph, scales and zero-points that are calculated on the fly by the training pr…
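For context, the scale and zero-point in question are the usual affine-quantization parameters, derived from a tensor's observed min/max rather than fixed ahead of time. A minimal sketch of that on-the-fly calculation (plain Python for illustration, not StableHLO; the 8-bit asymmetric scheme is an assumption):

```python
def affine_qparams(xmin, xmax, num_bits=8):
    """Compute (scale, zero_point) for asymmetric quantization to [0, 2^bits - 1]."""
    qmin, qmax = 0, (1 << num_bits) - 1
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)  # range must include zero
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = round(qmin - xmin / scale) if scale else 0
    return scale, int(min(max(zero_point, qmin), qmax))

def quantize(x, scale, zero_point, num_bits=8):
    """Map a real value to its integer code, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return min(max(q, 0), (1 << num_bits) - 1)

def dequantize(q, scale, zero_point):
    """Recover the real value approximated by integer code q."""
    return scale * (q - zero_point)
```

Since these parameters are only known once the data has been seen, the graph has to carry them as runtime values rather than compile-time constants, which is what "unknown scales and zero-points" refers to here.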