Hello!
I believe you need to move your inputs onto the GPU as well. Have you tried following the suggestion in the warning: "Please make sure that you have put input_ids to the correct device by calling for example input_ids = input_ids.to('meta')"?
The way I would go about running this on a GPU is to move the model to your desired device after initialization, which should look something like this:
from open_flamingo import create_model_and_transforms
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="anas-awadalla/mpt-1b-redpajama-200b",
    tokenizer_path="anas-awadalla/mpt-1b-redpajama-200b",
    cross_attn_every_n_layers=1,
)
# Move model to GPU 0
model.to(0)
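If you want the snippet to fall back to the CPU when no GPU is present, a standard PyTorch device check works too (a minimal sketch, nothing OpenFlamingo-specific):

import torch

# Use GPU 0 when CUDA is available, otherwise stay on CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)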
When you are generating, you should move your inputs to the same device as well. This can be done with the following code:
generated_text = model.generate(
    vision_x=vision_x.to(0),
    lang_x=lang_x["input_ids"].to(0),
    attention_mask=lang_x["attention_mask"].to(0),
    max_new_tokens=20,
    num_beams=3,
)
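Note that generate returns token IDs rather than text, so the output still needs decoding. Assuming the tokenizer returned by create_model_and_transforms behaves like a standard Hugging Face tokenizer, something like this should work:

# Decode the first sequence in the batch back into a string
print(tokenizer.decode(generated_text[0], skip_special_tokens=True))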
Thanks a lot. It works perfectly.
Hey, I am running into an issue trying to load the model on my GPU. If I set init_device to 'meta' in ~/.cache/huggingface/hub/models--anas-awadalla--mpt-7b/snapshots/b772e556c8e8a17d087db6935e7cd019e5eefb0f/config.json, my code crashes in the model = model.to("cuda") call with the following error.
Here is my code:
from PIL import Image
import matplotlib.pyplot as plt
from open_flamingo import create_model_and_transforms
from huggingface_hub import hf_hub_download
import torch
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="anas-awadalla/mpt-7b",
    tokenizer_path="anas-awadalla/mpt-7b",
    cross_attn_every_n_layers=4,
)
model = model.to("cuda")
And this is the error:
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████| 3/3 [00:33<00:00, 11.02s/it]
Flamingo model initialized with 1384781840 trainable parameters
Traceback (most recent call last):
File "/home/ldoorenbos/llm-ood/test.py", line 18, in <module>
model = model.to("cuda")
^^^^^^^^^^^^^^^^
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1145, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
^^^^^^^^^
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!
Do you know what causes this? How can I load the OpenFlamingo model on GPU?
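For context on the error itself: parameters created with init_device='meta' are shape-only placeholders with no underlying storage, so .to("cuda") has nothing to copy. PyTorch's escape hatch is Module.to_empty(), which allocates uninitialized storage on the target device, after which the real weights must be loaded on top. A minimal sketch, assuming a hypothetical checkpoint path of your own:

import torch

# to_empty() materializes meta parameters as uninitialized tensors on the GPU;
# the values are garbage until a state dict is loaded over them.
model = model.to_empty(device="cuda")
state_dict = torch.load("checkpoint.pt", map_location="cuda")  # hypothetical path
model.load_state_dict(state_dict, strict=False)

In practice, the simpler route for this thread is to leave init_device at its 'cpu' default and move the model afterwards with model.to("cuda"), as shown at the top.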
Did you ever solve it? I am running into the same issue.
Hi, I tried to run this project on my PC. It was very slow because it used the CPU instead of the GPU. The log showed:
You are using config.init_device='cpu', but you can also use config.init_device="meta" with Composer + FSDP for fast initialization
I found that I can change config.init_device='cpu' to config.init_device='meta' in the model config under ~/.cache/huggingface/hub/models--anas-awadalla--mpt-1b-redpajama-200b/snapshots/8bc4eba452b5a5330f81975a761e4a59c851beea.
But I got this error:
You are calling .generate() with the input_ids being on a device type different than your model's device. input_ids is on cpu, whereas the model is on meta. You may experience unexpected behaviors or slower generation. Please make sure that you have put input_ids to the correct device by calling for example input_ids = input_ids.to('meta') before running .generate().
RuntimeError: Tensor on device meta is not on the expected device cpu!
How can I set up this project to use GPU?
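If anyone hits the same mixed-device errors, a quick sanity check (plain PyTorch, nothing project-specific) is to print the set of devices the parameters actually live on:

# A model stuck partly on the meta device will show device(type='meta') here
print({p.device for p in model.parameters()})

The fix that worked earlier in this thread is to leave init_device='cpu' in the config and move the model with model.to(0), and the inputs with .to(0), after initialization.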