huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0

Error indicating Model loading on meta device #2103

Closed by RonanKMcGovern 11 months ago

RonanKMcGovern commented 11 months ago

System Info

- `transformers` version: 4.35.0.dev0
- Platform: Linux-5.4.0-153-generic-x86_64-with-glibc2.35
- Python version: 3.10.6
- Huggingface_hub version: 0.17.3
- Safetensors version: 0.4.0
- Accelerate version: 0.25.0.dev0
- Accelerate config:    not found
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes, a single A6000
- Using distributed or parallel set-up in script?: No; with only one GPU this shouldn't be relevant, yet the model is somehow being loaded at least partly onto the CPU.

Reproduction

import torch
from transformers import AutoModelForCausalLM

model_id = "tiiuae/falcon-7b"

# 4-bit quantized load (requires the bitsandbytes package)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_4bit=True,
    torch_dtype=torch.bfloat16,
)

# Verify placement: report any parameter left on the meta device
for n, p in model.named_parameters():
    if p.device.type == "meta":
        print(f"{n} is on meta!")

This produces the following warnings:

WARNING:root:Some parameters are on the meta device device because they were offloaded to the .
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu/disk.

even though no parameters are actually on the meta device (the check above prints nothing).
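One way to double-check that claim, reusing `model` from the snippet above (a sketch; `hf_device_map` is the placement map accelerate attaches when it dispatches a model):

from collections import Counter

# Tally parameters per device; "meta" should not appear in the output
print(Counter(str(p.device) for p in model.parameters()))

# If accelerate dispatched the model, hf_device_map records each module's device
print(getattr(model, "hf_device_map", None))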

Expected behavior

The model should load entirely onto the single A6000 GPU, without any CPU offload and without the meta-device warning.
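One workaround sketch (not from the thread): make the placement explicit via `device_map`, a standard `from_pretrained` argument; mapping the root module `""` to `0` pins the entire model to GPU 0.

import torch
from transformers import AutoModelForCausalLM

# Pin every module to GPU 0 so nothing is offloaded to CPU or disk.
# "" is the root module; 0 is the CUDA device index.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    load_in_4bit=True,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
)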

SunMarc commented 11 months ago

Hi @RonanKMcGovern, thanks for reporting this issue! This will be fixed by the PR linked above.