-
There have been quite a few changes since 0.1. We should document them for people updating their applications.
-
### What happened?
Hello, I'm having a problem quantizing a `safetensors` model to BF16 using `convert-hf-to-gguf.py`.
I can quantize any model to `f16` or `q8_0`, but I can't convert them to `bf1…
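For context on what a BF16 conversion actually does: bfloat16 keeps float32's sign and 8-bit exponent and truncates the mantissa to 7 bits, so a tensor can be converted by rounding away the low 16 bits of each float32 word. A minimal NumPy sketch of that bit-level operation (the helper names here are illustrative, not part of `convert-hf-to-gguf.py`):

```python
import numpy as np

def f32_to_bf16_bits(x):
    """Convert a float32 array to bfloat16 bit patterns (round to nearest even)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    # add a rounding bias before truncating the low 16 mantissa bits
    rounded = bits + np.uint32(0x7FFF) + ((bits >> 16) & 1)
    return (rounded >> 16).astype(np.uint16)

def bf16_bits_to_f32(h):
    """Widen bfloat16 bit patterns back to float32 (exact, no rounding)."""
    return (h.astype(np.uint32) << np.uint32(16)).view(np.float32)

x = np.array([1.0, 3.14159265], dtype=np.float32)
roundtrip = bf16_bits_to_f32(f32_to_bf16_bits(x))
# BF16 keeps full float32 range but only ~3 significant decimal digits
assert np.all(np.abs(roundtrip - x) <= np.abs(x) * 2**-8)
```

Because BF16 preserves the float32 exponent range, conversion never overflows the way a float16 conversion can; only mantissa precision is lost, which is why it is a popular storage type for model weights.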
-
### 🐛 Describe the bug
## Description
The outputs of the fully quantized and fake-quantized models do not match, and the fully quantized model does not match the expected analytical results for a minima…
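As a sanity baseline for this kind of mismatch: per-tensor fake quantization (quantize–dequantize in float) should agree bit-for-bit with the real integer path when both use the same scale and rounding. A small NumPy sketch of that check (function names are illustrative, not this repository's API):

```python
import numpy as np

def quantize(x, scale):
    # symmetric int8 quantization: round to nearest and clamp to [-127, 127]
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def fake_quantize(x, scale):
    # quantize-dequantize in float: simulates only the quantization error
    return quantize(x, scale).astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(16).astype(np.float32)
scale = np.abs(x).max() / 127.0

# the integer path followed by dequantization must equal the fake-quant output
int_path = quantize(x, scale).astype(np.float32) * scale
assert np.array_equal(int_path, fake_quantize(x, scale))
```

If a check like this fails for a single layer, the usual suspects are a scale computed differently in the two paths, a different rounding mode (round-half-up vs round-to-nearest-even), or asymmetric clamping bounds.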
-
Following the earlier discussion, I first set export_quantization_bit=None and exported the unquantized model to export_dir.
Then, keeping export_dir as the export path, I selected export_quantization_bit = 4bit,
but I still get the error:
Please merge adapters before quantizing the model.
Could you please clarify? Than…
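The error means the LoRA adapter weights are still held separately from the base weights; merging folds the low-rank update back into each dense matrix before quantization sees it. A NumPy sketch of what the merge computes, using the standard LoRA formulation W' = W + (α/r)·BA (this illustrates the math, not LLaMA-Factory's code):

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a LoRA update into the base weight: W' = W + (alpha / r) * B @ A."""
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))   # down-projection, shape (r, d_in)
B = rng.standard_normal((d_out, r))  # up-projection, shape (d_out, r)
x = rng.standard_normal(d_in)

merged = merge_lora(W, A, B, alpha, r)
# adapter-on-the-side forward pass equals the merged forward pass
side = W @ x + (alpha / r) * (B @ (A @ x))
assert np.allclose(merged @ x, side)
```

This is also why quantization-first does not work: once W is quantized to 4-bit, the float low-rank update can no longer be folded in exactly, so the tooling insists on merge-then-quantize.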
-
- [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
- [x] I checke…
-
```
Traceback (most recent call last):
  File "/home/admin/workspace/aop_lab/app_source/run_gptq.py", line 89, in <module>
    model = AutoGPTQForCausalLM.from_pretrained(args.model_name_or_path, quantize_confi…
```
-
As noted in the README, the results on ImageNet are not as good as those reported in the paper.
Can you tell me how large the accuracy difference is?
-
Good to see you. I'm a newbie.
I am using an Apple M2 laptop and want to try training a model with lora.py. If I run the following:
```bash
python lora.py --train --model Qwen/Qwen2-0.5B-…
-
Hi there,
According to the documentation
https://github.com/analogdevicesinc/ai8x-training#quantization-aware-training-qat
we can use either QAT or post-training quantization, but can I use both of them? If …
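Conceptually the two can compose: QAT leaves you with float weights that already sit on (or very near) the quantization grid, so applying post-training quantization afterwards just snaps them to that grid and, with matching scales, changes nothing. A small NumPy sketch of that idempotence property (illustrative only, not ai8x-training's implementation):

```python
import numpy as np

def ptq(x, scale):
    # post-training quantization: snap values to the int8 grid, keep float form
    return np.clip(np.round(x / scale), -127, 127) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal(32)
scale = np.abs(w).max() / 127.0

w_qat = ptq(w, scale)        # stand-in for weights produced by QAT's fake-quant
w_final = ptq(w_qat, scale)  # applying PTQ afterwards changes nothing
assert np.array_equal(w_qat, w_final)
```

The caveat is scale mismatch: if the PTQ pass recomputes scales differently from the ones used during QAT, the second rounding does move weights, and you lose part of the benefit of training with quantization in the loop.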
-
Hi,
Could you please give more precise details on how many GPUs and how much memory are needed to run inference? For training, I'm assuming based on the README that it's one A100-SXM-80…