-
**Describe the bug**
gemma-2-9b-it-gptq-4bit hits a CUDA out-of-memory (OOM) error on an RTX 3090
**GPU Info**
```
Sun Aug 4 02:35:35 2024
+-----------------------------------------------------------------------…
-
### Model Series
Qwen2.5
### What are the models used?
Qwen2.5-72B-Instruct-GPTQ-Int4
### What is the scenario where the problem happened?
Qwen2.5-72B-Instruct-GPTQ-Int4 params error
### Is this…
-
### System Info
# Name Version Build Channel
accelerate 1.0.1 pypi_0 pypi
aiofiles 23.2.1 …
-
```
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
/opt/venv/lib/python3.11/site-packages/flash_attn_2_cuda.cp…
-
Hi, thank you for making this.
I receive this error on the first launch of ComfyUI, after installing it and following the directions. I am on Windows 11, Python 3.11.9, CUDA 12.4:
```
Traceback (most recent …
-
**Describe the bug**
Quantizing mlp.down_proj in layer 0 of 125: 0%| | 0/126 [00:44
-
**Describe the bug**
I am trying to quantize Llama3.1 using GPTQ, but I encounter an error where some tensors are on the CPU and others on the GPU.
This used to work for Llama3 on the exact same hardware, with the same s…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
--------------------------------- -------------------- --------------------------
accelerate …
-
### System Info
accelerate 0.29.3
aiobotocore 2.7.0
aiofiles 23.2.1
aiohttp …
-
### System Info
Package Version
----------------------------- ------------
accelerate 0.33.0
aiobotocore 2.7.0
aiofiles …