hf-lin / ChatMusician


Questions about the merge script - why not type auto? #12

Closed petergreis closed 6 months ago

petergreis commented 6 months ago

Just to report some success, this seems to work (subject to inference testing):

python model/train/merge.py --ori_model_dir ../chatmusician_model_tokenizer --model_dir model/train/output_dir/epoch-2-step-200 --output_dir new_out

Looking further at the merge.py script, I ask the following:

why are the dtypes here:

base_model = LlamaForCausalLM.from_pretrained(args.ori_model_dir, torch_dtype=torch.float16)
lora_model = PeftModel.from_pretrained(base_model, args.model_dir, torch_dtype=torch.float16)

not set to torch_dtype='auto'?

Also, the input argument --ori_model_dir I find confusing; why not --orig_model_dir ?

hf-lin commented 6 months ago

We simply used this precision (float16), so the script specifies this parameter explicitly.

"why not --orig_model_dir ?"

I'm not a native English speaker, so thank you for reminding me :)
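
For readers following this thread, here is a minimal sketch of how both points could be handled in a merge script. It is not the repository's actual merge.py. The --dtype flag, the --orig_model_dir alias, and the helper names build_parser, resolve_dtype, and merge are assumptions made for illustration. Passing torch_dtype="auto" to from_pretrained asks transformers to use the dtype recorded in the checkpoint rather than a fixed one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Merge LoRA weights into a base model (sketch).")
    # Both spellings map to the same destination, so existing commands keep working.
    parser.add_argument("--ori_model_dir", "--orig_model_dir", dest="ori_model_dir", required=True)
    parser.add_argument("--model_dir", required=True)
    parser.add_argument("--output_dir", required=True)
    # Hypothetical flag: the default mirrors the float16 setting discussed in the thread.
    parser.add_argument("--dtype", default="float16",
                        choices=["auto", "float16", "bfloat16", "float32"])
    return parser


def resolve_dtype(name: str):
    # "auto" is passed through as a string; other names become torch dtypes.
    if name == "auto":
        return "auto"
    import torch  # imported lazily so argument parsing works without torch installed
    return getattr(torch, name)


def merge(args: argparse.Namespace) -> None:
    # Heavy dependencies are imported here, only when a merge is actually run.
    from peft import PeftModel
    from transformers import LlamaForCausalLM

    dtype = resolve_dtype(args.dtype)
    base_model = LlamaForCausalLM.from_pretrained(args.ori_model_dir, torch_dtype=dtype)
    lora_model = PeftModel.from_pretrained(base_model, args.model_dir)
    # Fold the LoRA weights into the base weights and save a standalone model.
    merged = lora_model.merge_and_unload()
    merged.save_pretrained(args.output_dir)


def main(argv=None) -> None:
    merge(build_parser().parse_args(argv))
```

With this sketch, calling main(["--orig_model_dir", "../chatmusician_model_tokenizer", "--model_dir", "model/train/output_dir/epoch-2-step-200", "--output_dir", "new_out", "--dtype", "auto"]) would run the same merge as the command above, with the dtype taken from the checkpoint.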

petergreis commented 6 months ago

Thanks