Looking further at the merge.py script, I have two questions.
Why are the dtypes here:
base_model = LlamaForCausalLM.from_pretrained(args.ori_model_dir, torch_dtype=torch.float16)
lora_model = PeftModel.from_pretrained(base_model, args.model_dir, torch_dtype=torch.float16)
not set to torch_dtype='auto'?
Also, I find the input argument --ori_model_dir confusing; why not --orig_model_dir?
Just to report some success, this seems to work (subject to inference testing):
python model/train/merge.py --ori_model_dir ../chatmusician_model_tokenizer --model_dir model/train/output_dir/epoch-2-step-200 --output_dir new_out
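To make the two suggestions concrete, here is a minimal sketch of how the argument parsing could look. It is not the actual merge.py code: the --torch_dtype flag, the build_parser and resolve_dtype helpers, and the list of dtype choices are my own assumptions for illustration. It accepts both spellings of the base-model argument and keeps float16 as the default, so the current behaviour would not change unless 'auto' is requested.

```python
import argparse


def build_parser():
    """Hypothetical argument parser; not the real merge.py."""
    parser = argparse.ArgumentParser(description="Merge LoRA weights into a base model (sketch)")
    # Accept both spellings. The value is always stored in args.ori_model_dir,
    # so the rest of the script would not need to change.
    parser.add_argument("--ori_model_dir", "--orig_model_dir", dest="ori_model_dir", required=True)
    parser.add_argument("--model_dir", required=True)
    parser.add_argument("--output_dir", required=True)
    # Assumed flag (not in the original script): choose the load dtype.
    parser.add_argument(
        "--torch_dtype",
        default="float16",
        choices=["auto", "float16", "bfloat16", "float32"],
    )
    return parser


def resolve_dtype(name):
    """Map the flag value to something from_pretrained can take as torch_dtype."""
    if name == "auto":
        # transformers accepts the string "auto" and picks the dtype
        # from the checkpoint or its config.
        return "auto"
    import torch  # imported lazily so "auto" works without touching torch here

    return getattr(torch, name)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.ori_model_dir, resolve_dtype(args.torch_dtype))
```

The resolved value would then be passed as torch_dtype=resolve_dtype(args.torch_dtype) when loading the base model. I have not checked whether PeftModel.from_pretrained treats 'auto' the same way, so that part would need testing.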