Closed zihui-debug closed 7 months ago
@zihui-debug
def __init__(self, code_path, num_gpus=1):
    self.code_path = code_path
    # Load the tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(code_path, trust_remote_code=True)
    self.chat_model = AutoModelForCausalLM.from_pretrained(code_path, device_map='cuda', trust_remote_code=True).half().eval()
    self.chat_model.tokenizer = tokenizer
    # If more than one GPU is available, use DataParallel
    if torch.cuda.device_count() > 1:
        print(f"Let's use {torch.cuda.device_count()} GPUs!")
        self.chat_model = torch.nn.DataParallel(self.chat_model)
    self.chat_model.to('cuda')  # Make sure the model is moved to CUDA
    stop_words_ids = [92542]
    self.stopping_criteria = get_stopping_criteria(stop_words_ids)
    set_random_seed(1234)
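The wrapping pattern used above can be sketched in isolation. This is a minimal illustration only, with a tiny Linear layer standing in for the real chat model; it falls back to CPU when no GPU is visible, which the snippet above does not do:

```python
import torch

# Stand-in for the real chat model
model = torch.nn.Linear(4, 2)

# Wrap in DataParallel only when more than one GPU is visible
if torch.cuda.device_count() > 1:
    print(f"Let's use {torch.cuda.device_count()} GPUs!")
    model = torch.nn.DataParallel(model)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)

# A forward pass works the same way through the wrapper:
# DataParallel splits the batch across GPUs and gathers the outputs.
x = torch.randn(8, 4, device=device)
out = model(x)
print(tuple(out.shape))  # (8, 2)
```

Note that DataParallel only parallelizes the batch dimension of forward passes; it does not split the model's weights across GPUs, so each card still needs to hold a full copy of the model.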
Give this a try; it works on my two GPUs now.
@wp1811983038 I run out of GPU memory at the from_pretrained call; my card has 24G. How large are your GPUs?
Two 32G cards. It succeeded once, then all kinds of errors started appearing.
@zihui-debug @wp1811983038 We now support loading the ShareCaptioner model on multiple GPUs; please refer to https://github.com/InternLM/InternLM-XComposer/blob/main/projects/ShareGPT4V/tools/share-cap_batch_infer.py and set --num_gpus
Feel free to reopen this issue if you have any problems.
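The linked script shards the model's layers across the available GPUs (model parallelism), rather than replicating the whole model on each card as DataParallel does. As a rough, hypothetical illustration of that idea, not the repository's actual code, a two-block toy model can be split across two devices like this:

```python
import torch

class TwoBlockModel(torch.nn.Module):
    """Toy model with each block pinned to its own device."""
    def __init__(self, d0, d1):
        super().__init__()
        self.block1 = torch.nn.Linear(4, 8).to(d0)
        self.block2 = torch.nn.Linear(8, 2).to(d1)
        self.d0, self.d1 = d0, d1

    def forward(self, x):
        # Move activations between devices at the block boundary
        h = self.block1(x.to(self.d0))
        return self.block2(h.to(self.d1))

# Use two GPUs when available; otherwise fall back to CPU for both halves
if torch.cuda.device_count() > 1:
    d0, d1 = 'cuda:0', 'cuda:1'
else:
    d0 = d1 = 'cpu'

model = TwoBlockModel(d0, d1)
out = model(torch.randn(8, 4))
print(tuple(out.shape))  # (8, 2)
```

Because each GPU holds only part of the weights, this is the approach that helps when a single card's memory is too small for the full model, which DataParallel cannot address.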
How can I load the ShareCaptioner model on multiple GPUs? I have used
device_map='auto'
but only one GPU works. I also used accelerate.init_empty_weights(), and this is the code:
The error output:
Is there any guidance? Thank you