THUDM / ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型
Apache License 2.0

LoRA layer naming prevents lora weights from being processed by TensorRT-LLM's hf_lora_convert.py #1098

Closed wangqy1216 closed 5 months ago

wangqy1216 commented 5 months ago

As described in the title:

The LoRA weights produced by finetune_hf.py name their layers as:

'base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_A.weight'
'base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_B.weight'
'base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_A.weight'
'base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_B.weight'

whereas the lora weights example provided in https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/hf_lora_convert.py names them:

'base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight'
'base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight'
'base_model.model.model.layers.0.self_attn.v_proj.lora_A.weight'
'base_model.model.model.layers.0.self_attn.v_proj.lora_B.weight'
'base_model.model.model.layers.1.self_attn.q_proj.lora_A.weight'
'base_model.model.model.layers.1.self_attn.q_proj.lora_B.weight'
'base_model.model.model.layers.1.self_attn.v_proj.lora_A.weight'
'base_model.model.model.layers.1.self_attn.v_proj.lora_B.weight'

As a result, the weights cannot be converted.
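The mismatch above can be checked programmatically. Below is a minimal sketch (not part of either repo) that classifies checkpoint keys against the LLaMA-style naming used by the hf_lora_convert.py example; the two regex patterns are derived directly from the key lists in this issue. Note that ChatGLM3 uses a fused query_key_value projection, so even a key rename would not map cleanly onto separate q_proj/v_proj entries:

```python
import re

# Key pattern emitted by ChatGLM3's finetune_hf.py (from this issue).
CHATGLM_KEY = re.compile(
    r"base_model\.model\.transformer\.encoder\.layers\.(\d+)"
    r"\.self_attention\.query_key_value\.lora_(A|B)\.weight"
)

# Key pattern that the hf_lora_convert.py example weights follow (LLaMA-style).
HF_EXAMPLE_KEY = re.compile(
    r"base_model\.model\.model\.layers\.(\d+)"
    r"\.self_attn\.(q_proj|v_proj)\.lora_(A|B)\.weight"
)

def classify_keys(keys):
    """Split checkpoint keys into those matching the hf_lora_convert.py
    example naming and those that do not (e.g. ChatGLM3's fused QKV keys)."""
    supported, unsupported = [], []
    for key in keys:
        (supported if HF_EXAMPLE_KEY.fullmatch(key) else unsupported).append(key)
    return supported, unsupported
```

Running `classify_keys` over the keys of a ChatGLM3 LoRA checkpoint (e.g. `torch.load(...)` on the adapter weights) would place every `query_key_value` key in the unsupported list, which is consistent with the conversion failure described here.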

zRzRzRzRzRzRzR commented 5 months ago

The TRT demo can only convert the original model or fully fine-tuned (full-parameter) models.