Reminder

System Info

llamafactory version: 0.9.1.dev0

Reproduction

```yaml
### model
model_name_or_path: E:\Autodl-Run-Results\Qwen2.5-7b-16r-***\merge-checkpoint-218400
adapter_name_or_path: E:\LargeLanguageModels\LLaMA-Factory\saves\qwen2.5-7b-continue-but-source\checkpoint-3750

stage: sft
do_predict: true
finetuning_type: lora

eval_dataset: ***
template: qwen
cutoff_len: 512
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: E:\LargeLanguageModels\LLaMA-Factory\saves\qwen2.5-7b-continue-but-source\predict
overwrite_output_dir: true

per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000
bf16: true
```
Why is inference so slow when I run prediction directly from the trained checkpoint through `train`'s eval path (do_predict)? It is more than 2.4x slower than manually merging the adapter and writing my own inference script (5,000 samples take about 2h40m). Manual merging, however, adds extra steps and takes up a lot of disk space when I need to batch-infer over many checkpoints. Is there another way?
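For context, the "manually merge and write my own script" path mentioned above looks roughly like the sketch below, assuming a plain transformers + peft setup. The paths are copied from the config; the sample prompt and decoding settings are illustrative assumptions, not the exact script used here.

```python
# Sketch of the manual workaround: merge the LoRA adapter into the base model,
# then run generation with a standalone script. Paths mirror the config above;
# decoding settings are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = r"E:\Autodl-Run-Results\Qwen2.5-7b-16r-***\merge-checkpoint-218400"
adapter_path = r"E:\LargeLanguageModels\LLaMA-Factory\saves\qwen2.5-7b-continue-but-source\checkpoint-3750"

tokenizer = AutoTokenizer.from_pretrained(base_path)
model = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.bfloat16, device_map="auto"
)

# Fold the LoRA deltas into the base weights so generation no longer
# applies the adapter at every layer.
model = PeftModel.from_pretrained(model, adapter_path)
model = model.merge_and_unload()
model.eval()

# One sample from the eval dataset (placeholder content), formatted with the Qwen chat template.
messages = [{"role": "user", "content": "..."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Saving the merged model with `model.save_pretrained(...)` for reuse is the step that costs the extra disk space mentioned above.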
Expected behavior
No response
Others
No response