wizardforcel closed this issue 12 months ago.
Is there an existing issue for this?

Current Behavior

When fine-tuning chatglm2-6b-int4 with the script below, the loss is finite at step 0 but becomes nan from step 1:

```
epoch: 0, step: 0, ques: "[Round 1]\n\n问:请按照以下关键词生成广告:类型#裙*材质#针织*颜色#纯色*风格#复古*风格#文艺*风格#简约*图案#格子*图案#纯色*图案#复古*裙型#背带裙*裙长#连衣裙*裙领型#半高领\n\n答:", ans: "这款[[BRAND]]针织两件套连衣裙,简约的纯色半高领针织上衣,修饰着颈部线,尽显优雅气质。同时搭配叠穿起一条背带式的复古格纹裙,整体散发着一股怀旧的时髦魅力,很是文艺范。", loss: 7.67578125
epoch: 1, step: 1, ques: "[Round 1]\n\n问:请按照以下关键词生成广告:类型#裙*材质#针织*颜色#纯色*风格#复古*风格#文艺*风格#简约*图案#格子*图案#纯色*图案#复古*裙型#背带裙*裙长#连衣裙*裙领型#半高领\n\n答:", ans: "这款[[BRAND]]针织两件套连衣裙,简约的纯色半高领针织上衣,修饰着颈部线,尽显优雅气质。同时搭配叠穿起一条背带式的复古格纹裙,整体散发着一股怀旧的时髦魅力,很是文艺范。", loss: nan
```

Expected Behavior

No response

Steps To Reproduce

```python
from transformers import AutoConfig, AutoTokenizer, AutoModel
import torch
import re
import json
from typing import *

def combine_prompt_args(prompt: str, args: Dict[str, Any]):
    return re.sub(r'{(\w+)}', lambda g: args.get(g.group(1), ''), prompt)

base_path = r'../src'
model_path = r'd:/src/chatglm2-6b-int4/model/pytorch_model.bin'
save_path = r'd:/src/chatglm2-6b-int4/model/pytorch_model_trained.bin'

ds = [{
    "content": "类型#裙*材质#针织*颜色#纯色*风格#复古*风格#文艺*风格#简约*图案#格子*图案#纯色*图案#复古*裙型#背带裙*裙长#连衣裙*裙领型#半高领",
    "summary": "这款[[BRAND]]针织两件套连衣裙,简约的纯色半高领针织上衣,修饰着颈部线,尽显优雅气质。同时搭配叠穿起一条背带式的复古格纹裙,整体散发着一股怀旧的时髦魅力,很是文艺范。"
}]
ques_prompt = "请按照以下关键词生成广告:{content}"
ans_prompt = "{summary}"
self_args = {
    'lr': 5e-7,
    'n_epoch': 5,
    'save_step': 30,
}

conf = AutoConfig.from_pretrained(base_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(base_path, trust_remote_code=True)
# Load the architecture and the weights separately
model = AutoModel.from_config(conf, trust_remote_code=True)
stdc = torch.load(model_path)
model.load_state_dict(stdc, False)
model = model.cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=self_args['lr'])

step = 0
for epoch in range(self_args['n_epoch']):
    for i, dsi in enumerate(ds):
        # Fill the question and answer prompt templates
        ques = tokenizer.build_prompt(combine_prompt_args(ques_prompt, dsi))
        ans = combine_prompt_args(ans_prompt, dsi)
        # Convert question and answer to token IDs
        ques_ids = tokenizer.encode(text=ques, add_special_tokens=True, truncation=True)
        ans_ids = tokenizer.encode(text=ans, add_special_tokens=False, truncation=True)
        # Concatenate question and answer IDs into the input IDs
        input_ids = ques_ids + ans_ids + [tokenizer.eos_token_id]
        output_ids = [tokenizer.pad_token_id] * len(ques_ids) + ans_ids + [tokenizer.eos_token_id]
        # Ignore <PAD> positions in the loss
        output_ids = [(oid if oid != tokenizer.pad_token_id else -100) for oid in output_ids]
        # Batch size is 1, so no padding is needed
        optimizer.zero_grad()
        input_ids = torch.tensor([input_ids]).cuda()
        output_ids = torch.tensor([output_ids]).cuda()
        loss = model.forward(input_ids=input_ids, labels=output_ids, return_dict=True).loss
        loss.backward()
        print(f'epoch: {epoch}, step: {step}, ques: {json.dumps(ques, ensure_ascii=False)}, '
              f'ans: {json.dumps(ans, ensure_ascii=False)}, loss: {loss}')
        optimizer.step()
        # Save the weights every save_step steps
        if step % self_args['save_step'] == 0:
            torch.save(model.state_dict(), save_path)
        step += 1
torch.save(model.state_dict(), save_path)
```
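A side note on the label construction above: prompt positions are set to -100 so that the cross-entropy loss ignores them. A minimal, self-contained sketch (toy shapes, not the actual model) also shows that if every label ends up masked, the mean is taken over zero elements and the loss itself comes out as nan:

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss(ignore_index=-100)

logits = torch.randn(5, 10)                   # 5 positions, vocabulary of 10
labels = torch.tensor([-100, -100, 3, 7, 1])  # first two positions are the prompt

loss = loss_fn(logits, labels)                # averaged over the 3 unmasked positions
print(torch.isfinite(loss))                   # finite loss

all_masked = loss_fn(logits, torch.full((5,), -100))
print(torch.isnan(all_masked))                # every position ignored -> nan
```

This is one thing worth ruling out when a loss turns nan: that truncation has not masked out every label position.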
Environment

- OS: win10
- Python: 3.10
- Transformers: 4.31.0
- PyTorch: 2.0.1+cu117
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): `True`

Anything else?

No response
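One way to debug a run like this (not from the report; the `safe_step` helper and the clipping value are illustrative assumptions) is to guard every update so that a non-finite loss skips the optimizer step instead of corrupting the weights, which makes the first bad step easy to log:

```python
import torch

def safe_step(model, optimizer, loss, max_norm=1.0):
    """Run backward + optimizer step only when the loss is finite.

    Returns True if the update was applied, False if it was skipped
    because the loss was nan/inf.
    """
    if not torch.isfinite(loss):
        optimizer.zero_grad()  # drop any stale gradients
        return False
    loss.backward()
    # Gradient clipping: a common mitigation when updates explode into nan
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=max_norm)
    optimizer.step()
    optimizer.zero_grad()
    return True

# Toy usage with a linear model standing in for the real one
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
finite_loss = model(torch.randn(8, 4)).pow(2).mean()
applied = safe_step(model, optimizer, finite_loss)                 # update applied
skipped = safe_step(model, optimizer, torch.tensor(float('nan')))  # update skipped
```

In the training loop from the report, the `loss.backward()` / `optimizer.step()` pair would be replaced by a call like `safe_step(model, optimizer, loss)`, logging the first step at which it returns False.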