-
### System Info
I saved a 4-bit quantized model.
Then, how do I load the 4-bit quantized model directly with `from_pretrained`?
It is normal to save large models in float16, float32, or bfloat16.
…
-
### Your current environment
```text
PyTorch version: 2.2.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC vers…
-
After migrating from CDash 2.6, where everything worked as expected, to 3.2, we are facing a strange error when uploading a report from clang-tidy that contains warnings.
Below is the Build.xml bei…
-
### System Info
1. python: 3.10.12
2. nvcc:
```
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda comp…
-
Hi,
I'm trying to install GPTQModel from source, but I'm facing this error during import. Could you please specify the versions that work for you?
My CUDA version is 11.7 (via `nvcc -…
-
model_wrapper_cfg=dict(
type='MMDistributedDataParallel', find_unused_parameters=True)
Specify FSDPStrategy and configure its parameters
size_based_auto_wrap_policy = partial(
size_based_auto_wrap_policy, min_nu…
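A hedged sketch of how such an auto-wrap policy is typically built with `functools.partial`; the parameter threshold below is an assumption for illustration, not a value from the truncated config above.

```python
from functools import partial
from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy

# Wrap submodules once they exceed a parameter-count threshold;
# the 1e7 value is a hypothetical choice, not from the original config.
auto_wrap_policy = partial(size_based_auto_wrap_policy,
                           min_num_params=int(1e7))
print(callable(auto_wrap_policy))  # → True
```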
-
So I am trying to create a script to train the llama3-8b model on a conversational dataset. As stated in the [HF documentation](https://huggingface.co/docs/trl/en/sft_trainer#dataset-format-support):
…
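A hedged illustration of the conversational dataset format that the TRL docs describe: each example is a dict holding a `"messages"` list of role/content dicts. The sample content is invented for illustration.

```python
# One example in the conversational format accepted by SFTTrainer,
# per the TRL dataset-format docs; the text itself is a placeholder.
example = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]
}

roles = [m["role"] for m in example["messages"]]
print(roles)  # → ['user', 'assistant']
```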
-
### Your current environment
```
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC …
mgoin updated
1 month ago
-
In GPTQ-for-LLaMa, when creating a new linear layer, the bias option depends on the original layer:
https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/e985b700f19e670bad9b949cd83056889dd31448/quant/…
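A hypothetical sketch of that pattern (not the actual GPTQ-for-LLaMa code): the replacement layer's bias flag mirrors whether the original layer carried a bias, so bias-free layers stay bias-free after quantization.

```python
import torch.nn as nn

def replace_linear(old: nn.Linear) -> nn.Linear:
    # Carry the bias flag over from the origin layer, mirroring the
    # behavior referenced above; this helper name is hypothetical.
    return nn.Linear(old.in_features, old.out_features,
                     bias=old.bias is not None)

with_bias = replace_linear(nn.Linear(8, 4, bias=True))
without_bias = replace_linear(nn.Linear(8, 4, bias=False))
print(with_bias.bias is not None, without_bias.bias is None)  # → True True
```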
-
1. Is 8*80G enough? The fine-tuning best-practice script says 8*80GB, but I'm not sure whether that means one node or multiple nodes.
2. Also, how much host memory does fine-tuning use in the best practices? 1 TB of RAM?