Loading the adapter from a checkpoint for prediction produces output identical to the input

beyondguo / LLM-Tuning

Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning.


Loading the adapter from a checkpoint for prediction produces output identical to the input #4

Open ZacharyWaseda opened 1 year ago

ZacharyWaseda commented 1 year ago

As the title says: the model does not output any new tokens.

If I use the final adapter instead, the problem does not occur.

beyondguo commented 1 year ago

Please post the relevant code, otherwise I can't debug this for you.

ZacharyWaseda commented 1 year ago

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

device = 'cuda'
model_name = "baichuan-7b"
adapter_name = "weights/baichuan-7B/checkpoint-500"

# Load the base model in fp16
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    device_map='auto'
)

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True
)

# Attach the LoRA adapter stored in the checkpoint folder
model = PeftModel.from_pretrained(model, adapter_name)
model.eval()
model = model.to(device)

# Prompt, roughly: "The company we call 'Ailibaba' refers to"
text = "我们说的艾里巴巴公司，指的是"
inputs = tokenizer(text, return_tensors='pt')
inputs = inputs.to(device)
pred = model.generate(**inputs, max_new_tokens=5, repetition_penalty=1.1)
# Decodes the full sequence (prompt + generated tokens)
res = tokenizer.decode(pred.cpu()[0], skip_special_tokens=True)
print(res)
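A note on reading the output of the snippet above: for a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding the whole sequence always starts with the input text. If the model emits nothing but special tokens (for example an immediate end-of-sequence token, which `skip_special_tokens=True` hides), the printed result will look identical to the input. The sketch below works on plain lists of token IDs to show how the new part can be isolated. The helper name `new_token_ids` is hypothetical and not part of any library.

```python
def new_token_ids(prompt_ids, output_ids):
    """Return only the token IDs that were appended after the prompt.

    Assumes output_ids starts with prompt_ids, which is how causal
    language models return sequences from generation.
    """
    n = len(prompt_ids)
    if list(output_ids[:n]) != list(prompt_ids):
        raise ValueError("output does not start with the prompt")
    return list(output_ids[n:])
```

With the tensors from the snippet above, the equivalent slice would be `pred[0][inputs["input_ids"].shape[1]:]`. Decoding that slice with `skip_special_tokens=False` shows whether anything, including special tokens, was generated at all.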

beyondguo commented 1 year ago

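I tried it and it seems to work fine.

Could you try a larger max_new_tokens? Also, with the code I currently provide, the checkpoint folder does not contain an adapter_config.json file. Loading the checkpoint folder directly should, in principle, raise an error. You need to manually copy the adapter_config.json file from the outer folder into the checkpoint folder before the checkpoint can be loaded.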
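The manual copy step can be scripted. This is a minimal sketch, assuming the outer output folder (for example `weights/baichuan-7B`) holds `adapter_config.json` and the checkpoint is a subfolder of it. The function name `copy_adapter_config` is hypothetical.

```python
import shutil
from pathlib import Path

CONFIG_NAME = "adapter_config.json"


def copy_adapter_config(output_dir, checkpoint_dir):
    """Copy adapter_config.json from output_dir into checkpoint_dir.

    Does nothing if the checkpoint folder already has the file.
    Returns the path of the config inside the checkpoint folder.
    """
    src = Path(output_dir) / CONFIG_NAME
    dst = Path(checkpoint_dir) / CONFIG_NAME
    if dst.exists():
        return dst
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; has training finished?")
    shutil.copy2(src, dst)
    return dst
```

For the paths in this thread the call would be `copy_adapter_config("weights/baichuan-7B", "weights/baichuan-7B/checkpoint-500")`.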

ZacharyWaseda commented 1 year ago

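Right, I manually copied the outer adapter_config.json into the checkpoint folder. What I found is that predictions with the checkpoint adapter differ greatly from predictions with the final adapter. The final adapter follows the format used in fine-tuning, while the checkpoint adapter performs very poorly, as if no adapter had been loaded at all.

For example, the final adapter generates “否，需勾选【通过】” (roughly: "No, [Pass] must be checked"), while the checkpoint adapter generates nothing. Increasing max_new_tokens does not help either.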

zzy347964399 commented 1 year ago

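I ran into the same problem and finally got it working after a lot of fiddling. The fix is to wait until training finishes and then load the files outside the checkpoint folder.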
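The workaround above amounts to pointing the adapter path at the outer output folder instead of a `checkpoint-N` subfolder. A minimal sketch, assuming checkpoint folders are named `checkpoint-<step>` and sit directly inside the output folder; the helper name `final_adapter_dir` is hypothetical. It only rewrites the path and does not explain why the checkpoint adapter behaves as if nothing were loaded.

```python
from pathlib import Path


def final_adapter_dir(path):
    """Map a checkpoint subfolder to its parent output folder.

    Paths whose last component does not start with "checkpoint-"
    are returned unchanged.
    """
    p = Path(path)
    if p.name.startswith("checkpoint-"):
        return p.parent
    return p
```

With the paths from this thread, `final_adapter_dir("weights/baichuan-7B/checkpoint-500")` gives `weights/baichuan-7B`, which can then be passed to `PeftModel.from_pretrained` once training has finished.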