InternLM / InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

V100 32G cannot run inference, torch.cuda.OutOfMemoryError #205

Closed 21-10-4 closed 3 months ago

21-10-4 commented 4 months ago
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
# model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True).half().eval()
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-vl-7b', device_map='cuda', trust_remote_code=True).half().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True)

text = '<ImageHere>Please describe this image in detail.'
image = 'examples/image1.webp'
with torch.cuda.amp.autocast():
  response, _ = model.chat(tokenizer, query=text, image=image, history=[], do_sample=False)
print(response)
#The image features a quote by Oscar Wilde, "Live life with no excuses, travel with no regret,"
# set against a backdrop of a breathtaking sunset. The sky is painted in hues of pink and orange,
# creating a serene atmosphere. Two silhouetted figures stand on a cliff, overlooking the horizon.
# They appear to be hiking or exploring, embodying the essence of the quote.
# The overall scene conveys a sense of adventure and freedom, encouraging viewers to embrace life without hesitation or regrets.

This produces the following error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 31.74 GiB total capacity; 31.25 GiB already allocated; 11.38 MiB free; 31.37 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I'd like to know: how much GPU memory is normally needed to run this?
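For a rough sense of scale, here is a back-of-envelope estimate (illustrative arithmetic only; the real footprint also depends on the vision encoder, activations, and allocator overhead). Note that loading the checkpoint in fp32 and only then calling `.half()` transiently needs about 4 bytes per parameter, which for a 7B model is already close to the 32 GiB on a V100:

```python
# Back-of-envelope VRAM estimate for a 7B-parameter model.
params = 7e9

fp16_gb = params * 2 / 1024**3  # 2 bytes per param in fp16
fp32_gb = params * 4 / 1024**3  # 4 bytes per param in fp32

print(f"fp16 weights alone: ~{fp16_gb:.1f} GB")  # → fp16 weights alone: ~13.0 GB
print(f"fp32 weights alone: ~{fp32_gb:.1f} GB")  # → fp32 weights alone: ~26.1 GB
```

The ~13 GB of fp16 weights plus the ViT encoder, activations, and KV cache is consistent with the ~20 GB figure quoted below; the ~26 GB fp32 peak during loading would explain hitting OOM on a 32 GiB card.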

yhcao6 commented 4 months ago

Your code doesn't look up to date. Try the latest example_chat.py; 20 GB should be enough.
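One common way to lower the peak memory during loading is to request fp16 at load time instead of loading fp32 weights and then calling `.half()`. A minimal sketch, assuming the standard `transformers` `from_pretrained` keywords (the actual model call is left commented out since it downloads a 7B checkpoint):

```python
import torch

# Passing torch_dtype at load time keeps the checkpoint in fp16 from the
# start, instead of first materializing fp32 weights and then halving them,
# which roughly doubles the peak memory during loading.
load_kwargs = dict(
    torch_dtype=torch.float16,
    device_map='cuda',
    trust_remote_code=True,
)

# from transformers import AutoModel
# model = AutoModel.from_pretrained(
#     'internlm/internlm-xcomposer2-vl-7b', **load_kwargs
# ).eval()
```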

21-10-4 commented 3 months ago

> Your code doesn't look up to date. Try the latest example_chat.py; 20 GB should be enough.

Thanks, it does run successfully now.