Hello!
I believe you need to move your inputs onto the GPU as well. Have you tried following the suggestion in the warning: "Please make sure that you have put input_ids to the correct device by calling for example input_ids = input_ids.to('meta')"?
The way I would go about running this on a GPU is to move the model to your desired device after initialization, which should look something like this:
from open_flamingo import create_model_and_transforms
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="anas-awadalla/mpt-1b-redpajama-200b",
    tokenizer_path="anas-awadalla/mpt-1b-redpajama-200b",
    cross_attn_every_n_layers=1,
)
# Move model to GPU 0
model.to(0)
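If you want the snippet to fall back to the CPU when no GPU is present, a standard PyTorch device check works too (a minimal sketch, nothing OpenFlamingo-specific):

import torch

# Use GPU 0 when CUDA is available, otherwise stay on CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)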
When you are generating, you should move your inputs to the same device as well. This can be done with the following code:
generated_text = model.generate(
    vision_x=vision_x.to(0),
    lang_x=lang_x["input_ids"].to(0),
    attention_mask=lang_x["attention_mask"].to(0),
    max_new_tokens=20,
    num_beams=3,
)
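Note that generate returns token IDs rather than text, so the output still needs decoding. Assuming the tokenizer returned by create_model_and_transforms behaves like a standard Hugging Face tokenizer, something like this should work:

# Decode the first sequence in the batch back into a string
print(tokenizer.decode(generated_text[0], skip_special_tokens=True))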
Thanks a lot. It works perfectly.
Hey, I am running into an issue trying to load the model on my GPU. If I set init_device to 'meta' in ~/.cache/huggingface/hub/models--anas-awadalla--mpt-7b/snapshots/b772e556c8e8a17d087db6935e7cd019e5eefb0f/config.json, my code crashes in the model = model.to("cuda") call with the following error.
Here is my code:
from PIL import Image
import matplotlib.pyplot as plt
from open_flamingo import create_model_and_transforms
from huggingface_hub import hf_hub_download
import torch
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="anas-awadalla/mpt-7b",
    tokenizer_path="anas-awadalla/mpt-7b",
    cross_attn_every_n_layers=4,
)
model = model.to("cuda")
And this is the error:
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████| 3/3 [00:33<00:00, 11.02s/it]
Flamingo model initialized with 1384781840 trainable parameters
Traceback (most recent call last):
File "/home/ldoorenbos/llm-ood/test.py", line 18, in <module>
model = model.to("cuda")
^^^^^^^^^^^^^^^^
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1145, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
^^^^^^^^^
File "/home/ldoorenbos/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!
Do you know what causes this? How can I load the OpenFlamingo model on GPU?
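For context on the error itself: parameters created with init_device='meta' are shape-only placeholders with no underlying storage, so .to("cuda") has nothing to copy. PyTorch's escape hatch is Module.to_empty(), which allocates uninitialized storage on the target device, after which the real weights must be loaded on top. A minimal sketch, assuming a hypothetical checkpoint path of your own:

import torch

# to_empty() materializes meta parameters as uninitialized tensors on the GPU;
# the values are garbage until a state dict is loaded over them.
model = model.to_empty(device="cuda")
state_dict = torch.load("checkpoint.pt", map_location="cuda")  # hypothetical path
model.load_state_dict(state_dict, strict=False)

In practice, the simpler route for this thread is to leave init_device at its 'cpu' default and move the model afterwards with model.to("cuda"), as shown at the top.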
Did you ever solve it? I am running into the same issue.
Hi, I tried to run this project on my PC. It was very slow because it used the CPU instead of the GPU. The log showed:
You are using config.init_device='cpu', but you can also use config.init_device="meta" with Composer + FSDP for fast initialization
I found that I can change config.init_device='cpu' to config.init_device='meta' in the model config under ~/.cache/huggingface/hub/models--anas-awadalla--mpt-1b-redpajama-200b/snapshots/8bc4eba452b5a5330f81975a761e4a59c851beea.
But I got this error:
You are calling .generate() with the input_ids being on a device type different than your model's device. input_ids is on cpu, whereas the model is on meta. You may experience unexpected behaviors or slower generation. Please make sure that you have put input_ids to the correct device by calling for example input_ids = input_ids.to('meta') before running .generate().
RuntimeError: Tensor on device meta is not on the expected device cpu!
How can I set up this project to use GPU?
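If anyone hits the same mixed-device errors, a quick sanity check (plain PyTorch, nothing project-specific) is to print the set of devices the parameters actually live on:

# A model stuck partly on the meta device will show device(type='meta') here
print({p.device for p in model.parameters()})

The fix that worked earlier in this thread is to leave init_device='cpu' in the config and move the model with model.to(0), and the inputs with .to(0), after initialization.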