telexyz / GPT4VN

Anyone can build their own chatbot with instruction tuning, using a 12 GB GPU (RTX 3060) and a few dozen MB of data

Bug: final model is not saved when using DeepSpeed #3

Closed: tiendung closed this issue 1 year ago

tiendung commented 1 year ago

Google: "deepspeed/runtime/engine.py" _load_checkpoint RuntimeError('Error(s) in loading state_dict for

  File "lora.py", line 176, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)

  File "/home/dungalex/.local/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 2634, in load_module_state_dict     
    self.module.load_state_dict(state_dict, # TODO
  File "/home/dungalex/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
        Missing key(s) in state_dict:

tiendung commented 1 year ago

Using DeepSpeed stage 2 still (occasionally) triggers the error?!

tiendung commented 1 year ago

For now, not using DeepSpeed.
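
For anyone hitting the same `Missing key(s) in state_dict` traceback, a rough way to narrow it down is to compare the keys the model expects with the keys the checkpoint actually contains. The sketch below is plain Python with hypothetical key names (none of them come from this issue). With a real model, the two lists would come from `model.state_dict().keys()` and from the loaded checkpoint dict. One possible cause, which is an assumption and not confirmed in this thread, is a checkpoint that holds only the LoRA adapter weights while the full `PeftModelForCausalLM` state dict is expected on resume.

```python
def diff_state_dict_keys(expected_keys, loaded_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys in the checkpoint that the model does not know
    """
    expected = set(expected_keys)
    loaded = set(loaded_keys)
    missing = sorted(expected - loaded)
    unexpected = sorted(loaded - expected)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = [
    "base_model.model.layer.weight",
    "base_model.model.layer.lora_A.weight",
    "base_model.model.layer.lora_B.weight",
]
# Assumed scenario: the checkpoint stores adapter weights only.
checkpoint_keys = [
    "base_model.model.layer.lora_A.weight",
    "base_model.model.layer.lora_B.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

In PyTorch itself, `load_state_dict(state_dict, strict=False)` returns an object with `missing_keys` and `unexpected_keys` fields, which gives the same information without raising.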