yongzhuo / ChatGLM2-SFT

ChatGLM2-6B fine-tuning: SFT/LoRA, instruction fine-tuning
Apache License 2.0

ValueError raised during inference #1

Closed majortsui closed 1 year ago

majortsui commented 1 year ago

Hello! Thank you for sharing this work! During inference after training, I get the following error:

Traceback (most recent call last):
  File "chatglm2_6b/ft_chatglm2/predict.py", line 211, in <module>
    res = predict(data_dict)
  File "chatglm2_6b/ft_chatglm2/predict.py", line 189, in predict
    generation_output = model.generate(
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/peft/peft_model.py", line 731, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/transformers/generation/utils.py", line 1572, in generate
    return self.sample(
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/transformers/generation/utils.py", line 2616, in sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
  File "/media/ln01/2t/usr/cmz/ChatGLM2-SFT/chatglm2_6b/models/chatglm/modeling_chatglm.py", line 1124, in prepare_inputs_for_generation
    mask_positions.append(seq.index(mask_token))
ValueError: 150000 is not in list

What could be the cause?

majortsui commented 1 year ago

Many of the tensors printed during inference are also entirely zero; I don't know whether this is normal. For example:

'base_model.model.transformer.layers.27.attention.query_key_value.weight', torch.float16, False, tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]], device='cuda:0', dtype=torch.float16)
yongzhuo commented 1 year ago

Only train.py has the correct import paths; the other scripts were not updated. The fix is simply to change ft_chatglm to ft_chatglm2 and chatglm to chatglm2, i.e. change

    from chatglm2_6b.ft_chatglm.config import CUDA_VISIBLE_DEVICES, USE_TORCH, CPU_NUMS
    from chatglm2_6b.models.chatglm.modeling_chatglm import ChatGLMForConditionalGeneration, ChatGLMConfig
    from chatglm2_6b.models.chatglm.tokenization_chatglm import ChatGLMTokenizer
    from chatglm2_6b.ft_chatglm.config import PATH_MODEL_PRETRAIN, DATA_PATH, MODEL_SAVE_DIR, REPO_ID
    from chatglm2_6b.ft_chatglm.config import MICRO_BATCH_SIZE, BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS
    from chatglm2_6b.ft_chatglm.config import LEARNING_RATE, EPOCHS, SAVE_STEPS, VAL_SET_SIZE, TARGET_MODULES
    from chatglm2_6b.ft_chatglm.config import MAX_LENGTH_Q, MAX_LENGTH_A, MAX_LENGTH_QA
    from chatglm2_6b.ft_chatglm.config import LORA_DROPOUT, LORA_ALPHA, LORA_R
    from chatglm2_6b.ft_chatglm.config import USE_CUDA

to

    from chatglm2_6b.ft_chatglm2.config import CUDA_VISIBLE_DEVICES, USE_TORCH, CPU_NUMS
    from chatglm2_6b.models.chatglm2.modeling_chatglm import ChatGLMForConditionalGeneration, ChatGLMConfig
    from chatglm2_6b.models.chatglm2.tokenization_chatglm import ChatGLMTokenizer
    from chatglm2_6b.ft_chatglm2.config import PATH_MODEL_PRETRAIN, DATA_PATH, MODEL_SAVE_DIR, REPO_ID
    from chatglm2_6b.ft_chatglm2.config import MICRO_BATCH_SIZE, BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS
    from chatglm2_6b.ft_chatglm2.config import LEARNING_RATE, EPOCHS, SAVE_STEPS, VAL_SET_SIZE, TARGET_MODULES
    from chatglm2_6b.ft_chatglm2.config import MAX_LENGTH_Q, MAX_LENGTH_A, MAX_LENGTH_QA
    from chatglm2_6b.ft_chatglm2.config import LORA_DROPOUT, LORA_ALPHA, LORA_R
    from chatglm2_6b.ft_chatglm2.config import USE_CUDA

majortsui commented 1 year ago

Thanks for the reply! The current problem is:

Traceback (most recent call last):
  File "chatglm2_6b/ft_chatglm2/predict.py", line 211, in <module>
    res = predict(data_dict)
  File "chatglm2_6b/ft_chatglm2/predict.py", line 189, in predict
    generation_output = model.generate(
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/peft/peft_model.py", line 731, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/transformers/generation/utils.py", line 1572, in generate
    return self.sample(
  File "/home/ln01/miniconda3/envs/ChatGLM2/lib/python3.8/site-packages/transformers/generation/utils.py", line 2660, in sample
    raise ValueError("If `eos_token_id` is defined, make sure that `pad_token_id` is defined.")
ValueError: If `eos_token_id` is defined, make sure that `pad_token_id` is defined.
yongzhuo commented 1 year ago
  1. Newer versions of transformers require pad_token_id and eos_token_id in GenerationConfig; in predict.py/post_api.py/evaluation.py:

    generation_config = GenerationConfig(
        temperature=0.8,
        top_p=0.8,
        top_k=50,
        num_beams=1,
        do_sample=True,
        penalty_alpha=1.0,
        max_new_tokens=512,
        pad_token_id=ID_PAD,
        eos_token_id=ID_EOS,
    )
  2. models/chatglm2/modeling_chatglm.py also has a bug: the forward function of class ChatGLMForConditionalGeneration never defines seq_length. Add the line batch_size, seq_length = input_ids.shape just before if full_attention_mask is None: (see the sketch after this list).
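
For concreteness, a minimal sketch of where that line lands; the surrounding lines are approximated from the description above rather than copied from the repo:

    # models/chatglm2/modeling_chatglm.py, inside ChatGLMForConditionalGeneration.forward
    batch_size, seq_length = input_ids.shape  # added: define seq_length before it is used
    if full_attention_mask is None:
        ...  # existing mask-building code that references seq_length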

majortsui commented 1 year ago

Thank you very much, it runs now! But the model shows very severe repetition during inference, and I ran into the same problem when reproducing other chatglm2 projects. Even inference on the training set does not give good results. If I apply the changes to modeling_chatglm.py and then retrain, would that improve things?

yongzhuo commented 1 year ago

Note: base_model.model.transformer.layers.27.attention.query_key_value.weight being all zeros looks like the weights were not loaded properly; base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_B.default.weight being all zeros most likely means the adapter was not trained properly.
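
A minimal diagnostic sketch (assuming a PEFT-wrapped model object named model; parameter names as in the dump above). PEFT initializes lora_B to zeros by design, so an all-zero lora_B after loading a trained adapter points to the adapter not being trained or not loaded, while an all-zero frozen base weight points to the base checkpoint failing to load:

    import torch

    # Print each attention projection and whether it is entirely zero.
    for name, param in model.named_parameters():
        if "query_key_value" in name:
            is_zero = torch.count_nonzero(param) == 0
            print(name, param.dtype, param.requires_grad,
                  "all-zero" if is_zero else "nonzero")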

Regarding the severe repetition during inference: I modified the following code and it seems better, going from completely unusable to fluent sentences.

  1. Some torch versions do not support subtraction on bool tensors, so do the arithmetic after casting to int: in modeling_chatglm.py, change full_attention_mask -= padding_mask.unsqueeze(-1) - 1 to full_attention_mask = full_attention_mask.long() - padding_mask.unsqueeze(-1).long() - 1 (a minimal repro follows after this list).

  2. Strictly following the official prompt format works in practice, but the stop token still seems wrong. [screenshot attachment]

  3. Changed EOS to 2 (</s>) instead of eop.
  4. The source has been updated; you can re-clone and try again.
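
A minimal repro of the bool-subtraction issue from item 1 (hypothetical shapes; only the two mask lines mirror the fix above):

    import torch

    # padding_mask is a bool tensor marking real tokens; full_attention_mask
    # is the (here all-False) attention mask being adjusted.
    padding_mask = torch.tensor([[True, True, True, False]])
    full_attention_mask = torch.zeros(1, 4, 4, dtype=torch.bool)

    # Original line; raises on torch builds where `-` is unsupported for bool:
    # full_attention_mask -= padding_mask.unsqueeze(-1) - 1

    # Fixed line: cast to int first, then subtract.
    full_attention_mask = full_attention_mask.long() - padding_mask.unsqueeze(-1).long() - 1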

majortsui commented 1 year ago

OK, thank you!