InternLM / InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

InternLM-XComposer2-4KHD-7B multi-GPU inference error #265

Open ly19970621 opened 2 months ago

ly19970621 commented 2 months ago

Machine environment: 4 * RTX 4090

Run command:

    CUDA_VISIBLE_DEVICES=0,1 python examples/example_chat.py --num_gpus 2

The following error occurs:

    Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.44s/it]
    Some weights of InternLMXComposer2ForCausalLM were not initialized from the model checkpoint at /home/ai_group/model/internlm-xcomposer2/Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b and are newly initialized: ['vit.vision_tower.vision_model.post_layernorm.bias', 'vit.vision_tower.vision_model.post_layernorm.weight']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
    Traceback (most recent call last):
      File "/home/ai_group/liuy026/multi_modality/InternLM-XComposer/examples/example_chat.py", line 26, in <module>
        model = dispatch_model(model, device_map=device_map)
      File "/home/ai_group/anaconda3/envs/liuy026-py310/lib/python3.10/site-packages/accelerate/big_modeling.py", line 351, in dispatch_model
        check_device_map(model, device_map)
      File "/home/ai_group/anaconda3/envs/liuy026-py310/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1393, in check_device_map
        raise ValueError(
    ValueError: The device_map provided does not give any device for the following parameters: plora_glb_GN, plora_sub_GN
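As background: accelerate's dispatch_model runs check_device_map, which fails when any top-level parameter is not covered by a device_map key. A quick, purely hypothetical debugging snippet to list which parameters an existing device_map misses (assuming model and device_map are already built as in example_chat.py):

    # Hypothetical debugging snippet: list parameters the device_map leaves unassigned.
    missing = [
        name
        for name, _ in model.named_parameters()
        if not any(name == key or name.startswith(key + '.') for key in device_map)
    ]
    print(missing)  # in this case it would be expected to show ['plora_glb_GN', 'plora_sub_GN']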

ztfmars commented 2 months ago

(1) I have fixed the file examples/utils.py as follows:

    device_map = {
        'vit': 0,
        'vision_proj': 0,
        'model.tok_embeddings': 0,
        'plora_glb_GN': num_gpus - 1,
        'plora_sub_GN': num_gpus - 1,
        'model.norm': num_gpus - 1,
        'output': num_gpus - 1,
    }

It works for splitting the computation across different GPUs. @ly19970621 (2) InternLM-XComposer2-4KHD-7B inference uses far too much GPU memory, up to almost 80 GB. I have an A800 and it can only act as the inference server; that's too scary!
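For anyone hitting the same ValueError, here is a fuller sketch of what the patched auto_configure_device_map in examples/utils.py could look like, with the decoder layers also spread across GPUs. The layer count (32 for the 7B model) and the model.layers.{i} naming are assumptions, not copied from the repo:

    import math

    def auto_configure_device_map(num_gpus):
        # Sketch only: assumes 32 decoder layers for the 7B model; adjust if needed.
        num_layers = 32
        layers_per_gpu = math.ceil(num_layers / num_gpus)
        device_map = {
            'vit': 0,
            'vision_proj': 0,
            'model.tok_embeddings': 0,
            # the plora_* tensors were the parameters missing from the original map
            'plora_glb_GN': num_gpus - 1,
            'plora_sub_GN': num_gpus - 1,
            'model.norm': num_gpus - 1,
            'output': num_gpus - 1,
        }
        # Spread the transformer blocks across all visible GPUs.
        for i in range(num_layers):
            device_map[f'model.layers.{i}'] = min(i // layers_per_gpu, num_gpus - 1)
        return device_map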

- code script

    import sys
    sys.path.insert(0, '.')
    sys.path.insert(0, '..')
    import argparse
    import torch
    from modelscope import snapshot_download, AutoModel, AutoTokenizer
    from examples.utils import auto_configure_device_map

    torch.set_grad_enabled(False)

    parser = argparse.ArgumentParser()
    parser.add_argument("--num_gpus", default=1, type=int)
    parser.add_argument("--dtype", default='fp16', type=str)
    args = parser.parse_args()

    # init model and tokenizer
    model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b')
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

    if args.dtype == 'fp16':
        model.half().cuda()
    elif args.dtype == 'fp32':
        model.cuda()

    if args.num_gpus > 1:
        from accelerate import dispatch_model
        device_map = auto_configure_device_map(args.num_gpus)
        model = dispatch_model(model, device_map=device_map)

    ###############
    # First Round
    ###############
    query = 'Illustrate the fine details present in the image'
    image = 'examples/4khd_example.webp'
    # hd_num sets the number of HD sub-image patches; larger values cost more memory
    with torch.cuda.amp.autocast():
        response, his = model.chat(tokenizer, query=query, image=image, hd_num=55,
                                   history=[], do_sample=False, num_beams=3)

    print("*" * 10)
    print("-------> first round")
    print(response)
    print("*" * 10)

The image is a vibrant and colorful infographic that showcases 7 graphic design trends that will dominate in 2021. The infographic is divided into 7 sections, each representing a different trend.

Starting from the top, the first section focuses on "Muted Color Palettes", highlighting the use of muted colors in design.

The second section delves into "Simple Data Visualizations", emphasizing the importance of easy-to-understand data visualizations.

The third section introduces "Geometric Shapes Everywhere", showcasing the use of geometric shapes in design.

The fourth section discusses "Flat Icons and Illustrations", explaining how flat icons and illustrations are being used in design.

The fifth section is dedicated to "Classic Serif Fonts", illustrating the resurgence of classic serif fonts in design.

The sixth section explores "Social Media Slide Decks", illustrating how slide decks are being used on social media.

Finally, the seventh section focuses on "Text Heavy Videos", illustrating the trend of using text-heavy videos in design.

Each section is filled with relevant images and text, providing a comprehensive overview of the 7 graphic design trends that will dominate in 2021.

    ###############
    # Second Round
    ###############
    query1 = 'what is the detailed explanation of the third part.'
    with torch.cuda.amp.autocast():
        response, _ = model.chat(tokenizer, query=query1, image=image, hd_num=55,
                                 history=his, do_sample=False, num_beams=3)

    print("*" * 10)
    print("-------> second round")
    print(response)
    print("*" * 10)

The third part of the infographic is about "Geometric Shapes Everywhere". It explains that last year, designers used a lot of flowing and abstract shapes in their designs. However, this year, they have been replaced with rigid, hard-edged geometric shapes and patterns. The hard edges of a geometric shape create a great contrast against muted colors.


So is there anything wrong with the way I am using the 7B model like this?
Can you provide a script for a quantized InternLM-XComposer2-4KHD-7B (int4)? @myownskyW7
Cloopen-ReLiNK commented 2 months ago

I get an OOM error on a single 48 GB GPU.
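There is no official int4 script in this thread, but as a hedged sketch: loading the checkpoint with bitsandbytes 4-bit quantization might fit on a single 48 GB card. Whether BitsAndBytesConfig works with this model's custom remote code is an assumption, and the maintainers may publish dedicated int4 weights instead:

    import torch
    from modelscope import snapshot_download
    from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

    model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b')

    # 4-bit NF4 weights with fp16 compute; compatibility with the custom
    # modeling code is assumed, not verified.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type='nf4',
        bnb_4bit_compute_dtype=torch.float16,
    )

    model = AutoModel.from_pretrained(
        model_dir,
        quantization_config=bnb_config,
        device_map='auto',
        trust_remote_code=True,
    ).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

If this loads at all, lowering hd_num in model.chat should further reduce peak memory, since fewer HD patches mean fewer image tokens.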