hi, @sunxiaojie99 I faced the same problem (using Mistral-7B model). If you have solved this problem, how did you solve it?
Hi~ Yes, I have solved this problem. In my case, after many attempts, I found that if I delete the "safetensors" file in the output directory, the model loads successfully.
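For reference, here is a minimal sketch of that workaround (the `output_dir` path is a placeholder for your own LoRA output directory; this assumes the directory also contains a usable `.bin` checkpoint to fall back to):

```python
import os

# Placeholder path; point this at your own LoRA output directory.
output_dir = "model_repllama"

# Remove any safetensors checkpoint so that loading falls back to the
# .bin weights instead of a possibly empty/partial .safetensors file.
for name in os.listdir(output_dir):
    if name.endswith(".safetensors"):
        os.remove(os.path.join(output_dir, name))
```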
Hi, sorry for the late reply. Thanks to the latest PR from @ArvinZhuang, the safetensors issue should be fixed. Feel free to follow up if there are still errors with this.
> I found that if I delete the "safetensors" file in the output directory, the model loads successfully.
Thank you for your support. It worked well for my issue.
And I will also try the revised code :)
Hi! When I use `Mistral-7B-Instruct-v0.1` as the `base_model` and run repllama after training with LoRA, I get an error like: "size mismatch for base_model.model.model.layers.29.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, while the shape in the current model is torch.Size([14336, 8])." The error is raised at this line: `lora_model = PeftModel.from_pretrained(base_model, lora_name_or_path, config=lora_config)`.
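For context, here is a minimal sketch of the loading pattern where this fails (the base-model name and adapter path are placeholders, not repllama's exact code):

```python
import torch
from transformers import AutoModel
from peft import PeftConfig, PeftModel

# Placeholder identifiers; substitute your own base model and LoRA checkpoint.
base_model_name = "mistralai/Mistral-7B-Instruct-v0.1"
lora_name_or_path = "path/to/lora/checkpoint"

lora_config = PeftConfig.from_pretrained(lora_name_or_path)
base_model = AutoModel.from_pretrained(base_model_name, torch_dtype=torch.float16)

# The size-mismatch error above is raised here when the checkpoint's LoRA
# tensors are empty (torch.Size([0])), e.g. from a bad safetensors save.
lora_model = PeftModel.from_pretrained(base_model, lora_name_or_path, config=lora_config)
model = lora_model.merge_and_unload()
```

With an affected checkpoint, deleting the safetensors file as described above (or retraining/resaving with the fixed code) lets this load succeed.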
My Transformers version is transformers==4.38.0, since 4.33.0 doesn't support Mistral.
My commands are: