hengran opened this issue 3 months ago
Hi @hengran, this one is relevant: https://github.com/texttron/tevatron/issues/118. I guess you can delete the safetensors ckpt and use the adapter_model.bin in the saved LoRA ckpt (if it was not trained/saved with the latest version).
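For reference, a minimal sketch of that workaround (the paths and the base model name are placeholders, not from this issue; the `PeftModel.from_pretrained` call just mirrors the one in `encoder.py`):

```python
# Sketch of the suggested workaround: move the (possibly broken) safetensors
# adapter out of the way so PEFT falls back to adapter_model.bin.
# All paths and the base model name below are placeholders.
import os

from peft import PeftModel
from transformers import AutoModel

lora_dir = "path/to/lora_checkpoint"  # hypothetical saved LoRA ckpt directory
safetensors_ckpt = os.path.join(lora_dir, "adapter_model.safetensors")

# Rename rather than delete, so the original file is still available.
if os.path.exists(safetensors_ckpt):
    os.rename(safetensors_ckpt, safetensors_ckpt + ".bak")

base_model = AutoModel.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder base model
lora_model = PeftModel.from_pretrained(base_model, lora_dir, use_safetensors=False)
```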
I have the same problem, have you solved it?
Hi, I run into a problem when loading the model with `tevatron/retriever/driver/encode.py`. The LoRA checkpoint passed as `lora_name_or_path` was trained with `tevatron/retriever/driver/train.py`. I am confused by this error.

```
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/paddlejob/workspace/env_run/llm-index/src/tevatron/retriever/driver/encode.py", line 119, in <module>
    main()
  File "/root/paddlejob/workspace/env_run/llm-index/src/tevatron/retriever/driver/encode.py", line 63, in main
    model = DenseModel.load(
  File "/root/paddlejob/workspace/env_run/llm-index/src/tevatron/retriever/modeling/encoder.py", line 175, in load
    lora_model = PeftModel.from_pretrained(base_model, lora_name_or_path, config=lora_config, use_safetensors=False)
  File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 356, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 730, in load_adapter
    load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
  File "/usr/local/lib/python3.8/dist-packages/peft/utils/save_and_load.py", line 249, in set_peft_model_state_dict
    load_result = model.load_state_dict(peft_model_state_dict, strict=False)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForFeatureExtraction:
    size mismatch for base_model.model.layers.0.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([11008, 8]).
    size mismatch for base_model.model.layers.0.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([11008, 8]).
    size mismatch for base_model.model.layers.0.mlp.down_proj.lora
```
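To check whether the saved adapter itself is the problem, here is a hedged diagnostic sketch (the checkpoint path is a placeholder): it prints the shapes of the `lora_B` tensors in the safetensors file, which the error above suggests were written with shape `torch.Size([0])`.

```python
# Diagnostic sketch: inspect the saved safetensors adapter and print the
# shapes of all lora_B tensors. An empty shape like (0,) would match the
# size-mismatch error in the traceback. The path is a placeholder.
from safetensors import safe_open

ckpt = "path/to/lora_checkpoint/adapter_model.safetensors"  # hypothetical path

with safe_open(ckpt, framework="pt", device="cpu") as f:
    for key in f.keys():
        if "lora_B" in key:
            print(key, tuple(f.get_tensor(key).shape))
```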