InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

Can the glm-4v-9b model be supported? #1916

Open liyuan1208 opened 3 weeks ago

liyuan1208 commented 3 weeks ago

Motivation

While trying to adapt the glm-4v-9b model (actually 13.9B parameters in total; the vision part accounts for 4.9B), I found that glm4v applies special handling to the input position_ids:

```python
new_input_embeds.append(torch.cat(
    (inputs_embeds[i, :boi_token_pos], images_features[i],
     inputs_embeds[i, eoi_token_pos + 1:])))
new_position_ids.append(torch.cat(
    (position_ids[i, :boi_token_pos + 1],
     position_ids[i, boi_token_pos + 1].repeat(num_patches),
     position_ids[i, eoi_token_pos:])))
```

It assigns one and the same position_id value to the entire vision-feature span, and this value is later used when computing RoPE. The turbomind engine does not seem to provide an interface for modifying position_ids. After full-parameter fine-tuning, glm-4v gives the best results among open-source models in our scenario, so we hope the team can officially support glm-4v-9b.
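The effect of that snippet on the position ids can be illustrated with a small framework-free sketch (plain Python lists instead of torch tensors; the `boi`/`eoi` positions and patch count below are made-up values):

```python
# Illustrative sketch of GLM-4V's position_id handling: every vision
# patch between <boi> and <eoi> shares a single position id, so RoPE
# treats the whole image as occupying one position.
def build_position_ids(seq_len, boi_pos, eoi_pos, num_patches):
    position_ids = list(range(seq_len))        # ordinary text positions
    vision_id = position_ids[boi_pos + 1]      # the one id shared by all patches
    return (position_ids[:boi_pos + 1]         # text up to and including <boi>
            + [vision_id] * num_patches        # repeated id for the patches
            + position_ids[eoi_pos:])          # <eoi> and the trailing text

ids = build_position_ids(seq_len=10, boi_pos=2, eoi_pos=4, num_patches=3)
# -> [0, 1, 2, 3, 3, 3, 4, 5, 6, 7, 8, 9]
```

The repeated `3`s mirror the `position_ids[i, boi_token_pos + 1].repeat(num_patches)` term in the model code, which is exactly what a serving engine cannot reproduce if it only ever generates monotonically increasing position ids.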

Related resources

GLM-4 model repository: https://github.com/THUDM/GLM-4

Additional context

No response

RunningLeon commented 3 weeks ago

@liyuan1208 hi, glm-4v-9b will be supported by lmdeploy's pytorch engine. Will update once the pr is created.

danxuan2022 commented 2 weeks ago

> @liyuan1208 hi, glm-4v-9b will be supported by lmdeploy's pytorch engine. Will update once the pr is created.

Roughly how long until this is available? Looking forward to it 😚

danxuan2022 commented 2 weeks ago

> @liyuan1208 hi, glm-4v-9b will be supported by lmdeploy's pytorch engine. Will update once the pr is created.

Just my personal take, but supporting it before vllm does would attract a big wave of users...

RunningLeon commented 2 weeks ago

@danxuan2022 hi, you could try this PR https://github.com/InternLM/lmdeploy/pull/1947

danxuan2022 commented 1 week ago

> @danxuan2022 hi, you could try this PR #1947

👍 👍 Going to try it right away~

danxuan2022 commented 1 week ago

> @danxuan2022 hi, you could try this PR #1947

Just tried it: deploying the glm-4v-9b downloaded from modelscope works fine~

However, running lmdeploy on a fine-tuned glm-4v-9b fails. Could you please take a look?

Merge the LoRA weights of the fine-tuned model:

```shell
CUDA_VISIBLE_DEVICES=0 swift export --ckpt_dir '/home/admin/ai-testing-platform/data/dx/swift/glm4v_output/glm4v-9b-chat/v0-20240715-152012/checkpoint-50' --merge_lora true
```

Deploy the converted model:

```shell
lmdeploy serve api_server /home/admin/ai-testing-platform/data/dx/swift/glm4v_model --server-port 23333
```
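As a side note, once the api_server does come up it can be sanity-checked with a minimal client request. This is a sketch assuming the OpenAI-compatible `/v1/chat/completions` route that lmdeploy's api_server exposes; the model name here is a placeholder (query `/v1/models` on a live server for the real one):

```python
# Hypothetical smoke test against the server started above.
import json
from urllib import request

payload = {
    "model": "glm4v_model",  # placeholder; check /v1/models for the actual name
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.0,
}
req = request.Request(
    "http://localhost:23333/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the server running, uncomment to send the request:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```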

The error message is strange. It looks like an environment problem, but deploying the modelscope-downloaded glm-4v-9b works in the same environment, so I suspect it is most likely not the environment. The error output is as follows:

```
2024-07-15 16:44:07,336 - lmdeploy - WARNING - Try to run with pytorch engine because /home/admin/ai-testing-platform/data/dx/swift/glm4v_model is not explicitly supported by lmdeploy.
/usr/bin/ld: skipping incompatible /lib/libcuda.so when searching for -lcuda
/usr/bin/ld: skipping incompatible /lib/libcuda.so when searching for -lcuda
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../libcuda.so when searching for -lcuda
/usr/bin/ld: skipping incompatible //lib/libcuda.so when searching for -lcuda
/usr/bin/ld: skipping incompatible //usr/lib/libcuda.so when searching for -lcuda
/usr/bin/ld: cannot find -lcuda
collect2: error: ld returned 1 exit status
2024-07-15 16:44:08,558 - lmdeploy - ERROR - CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpz6ods113/main.c', '-O3', '-I/home/admin/ai-testing-platform/anaconda3/envs/dx_deploy/lib/python3.11/site-packages/triton/common/../third_party/cuda/include', '-I/home/admin/ai-testing-platform/anaconda3/envs/dx_deploy/include/python3.11', '-I/tmp/tmpz6ods113', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmpz6ods113/_add_kernel.cpython-311-x86_64-linux-gnu.so', '-L/lib64', '-L/lib', '-L/lib']' returned non-zero exit status 1.
2024-07-15 16:44:08,558 - lmdeploy - ERROR - test failed! Please ensure it has been installed correctly.
```

RunningLeon commented 1 week ago

@danxuan2022 It looks like the environment is not set up correctly and the triton installation is broken. Try reinstalling with triton==2.1.0.
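A minimal sketch of that suggestion, plus a diagnostic for the `cannot find -lcuda` linker failure visible in the log (environment-specific; adjust to your setup):

```shell
# Reinstall the pinned triton version suggested above.
pip uninstall -y triton
pip install triton==2.1.0

# The log shows ld skipping "incompatible" libcuda.so copies (likely 32-bit
# stubs on the search path); list the libcuda entries the dynamic linker
# actually knows about to confirm a 64-bit driver library is present.
ldconfig -p | grep libcuda
```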