Hi @ztfmars, would you mind sharing your LoRA model with us? We are celebrating the Dragon Boat Festival and will get back to you next Tuesday.
Is the preprocessing of the vision part the same as xcomposer 4khd? If not, it cannot be handled directly by LMDeploy.
We reuse the preprocessing of the VLM model's vision part. For xcomposer-4khd, the forward pass of the vision part requires the glb_GN and sub_GN weights. You can check whether the index.json file of the merged weights matches the original xcomposer 4khd.
https://huggingface.co/internlm/internlm-xcomposer2-4khd-7b/blob/main/build_mlp.py#L76 https://huggingface.co/internlm/internlm-xcomposer2-4khd-7b/blob/main/pytorch_model.bin.index.json#L553-L554
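For example, a quick way to run that check (a minimal sketch; the local path to the merged weights is a placeholder):

```python
import json

# Placeholder path to the merged model directory.
index_path = "4khd_3e_mixed_all/pytorch_model.bin.index.json"

with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# The 4khd vision forward needs these two extra weights (see the links above).
for name in ("glb_GN", "sub_GN"):
    print(name, "present" if name in weight_map else "MISSING")
```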
I fine-tuned xcomposer with finetune_lora.sh and merged the adapter with the pretrained "internlm-xcomposer2-4khd".
The weights have been uploaded:
merged lora: https://openxlab.org.cn/models/detail/ztfmars/xcomposer_lora_3e
full train: https://openxlab.org.cn/models/detail/ztfmars/nuclear_blueprint_assistant
Judging from the config there doesn't seem to be much difference; both have glb_GN and sub_GN. Please help take a look. I hope this trained version can be deployed, or please suggest some modifications. Thanks @lvhan028 @irexyc
I suspect the cause is here: https://github.com/InternLM/lmdeploy/blob/v0.4.2/lmdeploy/vl/model/xcomposer2.py#L63-L71
On huggingface, the architectures field in the xcomposer 4khd config is InternLM2ForCausalLM. LMDeploy 0.4.2 uses this field to decide whether to use _forward_7b or _forward_4khd_7b, and from your log it used _forward_7b. You can try changing architectures in config.json to InternLM2ForCausalLM (yours is probably InternLMXComposer2ForCausalLM).
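A minimal sketch of that edit (the path to the merged weights is a placeholder):

```python
import json

config_path = "4khd_3e_mixed_all/config.json"  # placeholder path

with open(config_path) as f:
    config = json.load(f)

# LMDeploy 0.4.2 reads this field to pick the vision forward
# (_forward_4khd_7b vs _forward_7b).
config["architectures"] = ["InternLM2ForCausalLM"]

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```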
Thanks, changing this works; the error missing 2 required positional arguments: 'glb_GN' and 'sub_GN' no longer occurs.
Another question: why, after generating the weights this way through code, does config.json show "attn_implementation": "eager" instead of "attn_implementation": "flash_attention_2"?
I installed flash_attn-2.5.6+cu118torch2.1cxx11abiFALSE-cp38-cp38-linux_x86_64.whl on top of the openmmlab/lmdeploy:v0.4.2 image, and import flash_attn raises no error inside the container. How can I fix this?
> why, after generating the weights this way through code, does config.json show "attn_implementation": "eager"
Do you mean the config.json generated by save_pretrained? That doesn't seem to have anything to do with us.
BTW, when running xcomposer2 inference with LMDeploy, the LLM part uses the TurboMind engine, which does not care what the value of attn_implementation is.
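For what it's worth, the value written to config.json is decided when the model is loaded, before save_pretrained is called. A minimal sketch using the standard transformers attn_implementation argument (paths are placeholders, and the custom xcomposer code must support this argument):

```python
import torch
from transformers import AutoModelForCausalLM

# attn_implementation is chosen at load time; save_pretrained then records
# the active implementation in config.json.
model = AutoModelForCausalLM.from_pretrained(
    "4khd_3e_mixed_all",  # placeholder path to the merged weights
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
)
model.save_pretrained("4khd_3e_mixed_all_fa2")
```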
A follow-up question then: when deploying a service, lmdeploy lets you choose either the TurboMind or the PyTorch engine via --backend {pytorch,turbomind}. If I use pytorch as the backend, would attn_implementation have any effect?
Not all models support both backends. For VLM models, most are only supported by the TurboMind backend; the pytorch backend currently supports cogvlm and llava. For details, see SUPPORTED_ARCHS in the two files below (a quick check is sketched after the links). The pytorch engine also ignores the value of attn_implementation.
https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/pytorch/supported_models.py https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/turbomind/supported_models.py
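A quick membership check against those tables (a minimal sketch; it assumes both modules export SUPPORTED_ARCHS as mentioned above, which may differ across versions):

```python
import json

from lmdeploy.pytorch.supported_models import SUPPORTED_ARCHS as PT_ARCHS
from lmdeploy.turbomind.supported_models import SUPPORTED_ARCHS as TM_ARCHS

with open("4khd_3e_mixed_all/config.json") as f:  # placeholder path
    arch = json.load(f)["architectures"][0]

print("pytorch backend supports it:", arch in PT_ARCHS)
print("turbomind backend supports it:", arch in TM_ARCHS)
```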
Got it, thanks!
Checklist
Describe the bug
I have fine-tuned xcomposer with finetune_lora.sh and merged the adapter with the pretrained "internlm-xcomposer2-4khd". I have tested it on the xcomposer-4khd gradio demo - link. The merged weights work well.
But when I use the same weights with the lmdeploy gradio demo, it raises an error:
I have tried the pretrained weights from Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b; they work well in the lmdeploy gradio demo. I am really confused about how to get my new LoRA weights to work.
My code is listed below.
My merge code for merging the xcomposer LLM and the LoRA adapter:
```shell
python3 merge_peft_adapter.py \
    --adapter_model_name=/home/fusionai/project/internllm_demo/xcomposer_test/train/4khd_3e_mixed_all \
    --base_model_name=/home/fusionai/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b \
    --output_name=4khd_3e_mixed_all
```
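merge_peft_adapter.py essentially follows the standard peft merge flow, roughly like this (a minimal sketch, not the exact script; paths are shortened placeholders):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "internlm-xcomposer2-4khd-7b"   # placeholder local path
adapter_model_name = "4khd_3e_mixed_all_adapter"  # placeholder adapter path
output_name = "4khd_3e_mixed_all"

base = AutoModelForCausalLM.from_pretrained(
    base_model_name, torch_dtype=torch.float16, trust_remote_code=True)
model = PeftModel.from_pretrained(base, adapter_model_name)

# Fold the LoRA deltas into the base weights and drop the adapter wrappers.
model = model.merge_and_unload()
model.save_pretrained(output_name)

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
tokenizer.save_pretrained(output_name)
```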
Looking forward to your reply!
Reproduction
```shell
python3 merge_peft_adapter.py \
    --adapter_model_name=/home/fusionai/project/internllm_demo/xcomposer_test/train/4khd_3e_mixed_all \
    --base_model_name=/home/fusionai/.cache/modelscope/hub/Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b \
    --output_name=4khd_3e_mixed_all
```
python gradio_demo_lmdeploy.py
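gradio_demo_lmdeploy.py builds on the standard LMDeploy VLM pipeline, roughly like this (a minimal sketch with placeholder paths, not the exact demo script):

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Point the pipeline at the merged weights (placeholder path).
pipe = pipeline("4khd_3e_mixed_all")

image = load_image("demo.jpg")  # placeholder image
response = pipe(("describe this blueprint", image))
print(response.text)
```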
Environment
Error traceback