ssbuild / aigc_serving

aigc_serving: lightweight and efficient language model inference serving
Apache License 2.0

When adapter_name: default is enabled, only the model with the default LoRA weights can be used #14

Closed shell-nlp closed 1 year ago

shell-nlp commented 1 year ago
```python
{
    "chatglm2-6b": {
        "enable": True,
        "work_mode": "deepspeed",  # one of deepspeed, accelerate, hf
        "workers": [
            {
                "device_id": [0, 1, 2, 3]  # starts one worker by default, using the first GPU
            }
        ],

        "auto_quantize": False,  # whether to quantize the model automatically
        "model_config": {
            "model_type": "chatglm2",
            "model_name_or_path": "/home/dev/model/chatglm2-6B",
            # "model_name_or_path": "/home/dev/model/chatglm2-6B-lora-prefix100",
            "use_fast_tokenizer": False,
            "do_lower_case": None,
            "lora": {
                # multiple LoRA heads: adapter_name -> lora weight dir
                "default": "/home/dev/project/chatglm2_finetuning/best_ckpt_prefix_100/last",
                # "your_adapter_name": "/data/nlp/pre_models/torch/your_adapter_dir",
            }
        }
    }
}
```

With adapter_name: default enabled, the original glm2 model can no longer be used; only the model with the default LoRA weights is available. Could it be made possible to use both the LoRA model and the original model?

ssbuild commented 1 year ago

An auto_merge_lora_single field has been added to the configuration file. It defaults to True; if you need to switch between the base model and a LoRA head, set it to False. Example request parameters:

```python
data = {
    "model": model,
    "adapter_name": None,  # LoRA head
    "prompt": ["你是谁?"],
    "top_p": 0.8,
    "temperature": 1.0,
    "frequency_penalty": 1.01,
    "stream": stream,
    "nchar": 1,  # characters per stream chunk
    "n": 1,  # return n choices
    "stop": ["Observation:", "Observation:\n"]
}
```

Setting adapter_name to None calls the base model; setting it to a LoRA adapter name calls that LoRA head.
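The switching behaviour described above can be sketched as a small helper that builds both request variants. This is a minimal sketch, assuming the server honours the "adapter_name" field as described; the model name and prompt are taken from the examples in this thread, and `build_payload` itself is a hypothetical helper, not part of aigc_serving:

```python
# Sketch: build two completion payloads, one targeting the base model
# (adapter_name=None) and one targeting the "default" LoRA head.
def build_payload(model, adapter_name, prompt):
    """Build a completion request; adapter_name=None selects the base model."""
    return {
        "model": model,
        "adapter_name": adapter_name,  # None -> base model, "default" -> LoRA head
        "prompt": [prompt],
        "top_p": 0.8,
        "temperature": 1.0,
        "frequency_penalty": 1.01,
        "stream": False,
        "n": 1,  # return one choice
    }

base_req = build_payload("chatglm2-6b", None, "你是谁?")
lora_req = build_payload("chatglm2-6b", "default", "你是谁?")
print(base_req["adapter_name"], lora_req["adapter_name"])
```

The only field that differs between the two payloads is adapter_name, which is what makes per-request switching possible once auto_merge_lora_single is False.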

shell-nlp commented 1 year ago

Suggestion: add the available adapter_name values to the model card, so that openai.Model.list() shows which LoRA weights can be used.
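The suggestion could look roughly like this on the client side. This is a hypothetical sketch of the response shape, not the actual aigc_serving payload: the "adapters" field name follows the proposal later in this thread, and the second model id is purely illustrative.

```python
# Hypothetical shape of a model-list response once model cards carry
# an "adapters" field listing switchable LoRA heads.
model_list = {
    "data": [
        {"id": "chatglm2-6b", "adapters": ["default"]},
        {"id": "some-other-model", "adapters": None},  # no switchable LoRA heads
    ]
}

def switchable_adapters(models, model_id):
    """Return the LoRA adapter names advertised for a given model id, or None."""
    for card in models["data"]:
        if card["id"] == model_id:
            return card.get("adapters")
    return None

print(switchable_adapters(model_list, "chatglm2-6b"))
```

A client could then validate an adapter_name against this list before sending a request, instead of discovering a bad name only from a server error.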

shell-nlp commented 1 year ago

> An auto_merge_lora_single field has been added to the configuration file. It defaults to True; if you need to switch between the base model and a LoRA head, set it to False. Setting adapter_name to None calls the base model; setting it to a LoRA adapter name calls that LoRA head.

There is still a bug: after adapter_name is set to None, switching back to default no longer works; the answers are still those produced with None (the base model).

ssbuild commented 1 year ago
1. fix switch lora
2. The model card now includes an adapters field: if switchable LoRA heads exist it returns the list of their names, otherwise None.

shell-nlp commented 1 year ago

Error: aigc_serving-dev/serving/model_handler/base/infer.py", line 296, in switch_lora: self.lora_state.merge_adapter() # noqa

Cause: LoraModelState in aigc_serving-dev/serving/model_handler/base/data_define.py has no merge_adapter method.


ssbuild commented 1 year ago

self.lora_state.merge_adapter() # noqa

Removing the line self.lora_state.merge_adapter() fixes it.