OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0

[BUG]TypeError: Field.__init__() missing 1 required positional argument: 'kw_only' #903

Closed. ORGRUI closed this issue 1 week ago.

ORGRUI commented 1 month ago

To Reproduce: bash scripts/run_finetune_with_qlora_9_24_4.sh

#!/bin/bash
# Please run this script under ${project_id} in project directory of

# Parses arguments
model_name_or_path=LLM-Research/Meta-Llama-3-70B-Instruct
model_name_or_path=/data/hf_cache/hub/models--meta-llama--Meta-Llama-3-70B-Instruct/snapshots/5fcb2901844dde3111159f24205b71c25900ffbd
lora_model_path=/data/midreal/rio/lora/opening_0923
dataset_path=/data/midreal/rio/LMFlow-main/data/opening
conversation_template=llama3
output_dir=/data/midreal/rio/LMFlow-main/output_models/qlora_finetuned_llama3_70b_opening_model_9_27
deepspeed_args="--master_port=13001 --include localhost:1"

# Safety related arguments
trust_remote_code=0

while [[ $# -ge 1 ]]; do
  key="$1"
  case ${key} in
    -m|--model_name_or_path) model_name_or_path="$2"; shift ;;
    -d|--dataset_path) dataset_path="$2"; shift ;;
    --conversation_template) conversation_template="$2"; shift ;;
    -o|--output_model_path) output_dir="$2"; shift ;;
    --deepspeed_args) deepspeed_args="$2"; shift ;;
    --trust_remote_code) trust_remote_code="$2"; shift ;;
    *) echo "error: unknown option \"${key}\"" 1>&2; exit 1 ;;
  esac
  shift
done

# Finetune
exp_id=finetune_with_qlora_09_27
project_dir=$(cd "$(dirname $0)"/..; pwd)
log_dir=${project_dir}/log/${exp_id}
mkdir -p ${output_dir} ${log_dir}

deepspeed ${deepspeed_args} \
  examples/finetune.py \
    --model_name_or_path ${model_name_or_path} \
    --trust_remote_code ${trust_remote_code} \
    --dataset_path ${dataset_path} \
    --conversation_template ${conversation_template} \
    --output_dir ${output_dir} --overwrite_output_dir \
    --num_train_epochs 3 \
    --learning_rate 1e-4 \
    --block_size 1024 \
    --per_device_train_batch_size 1 \
    --use_qlora 1 \
    --save_aggregated_lora 0 \
    --deepspeed configs/ds_config_zero2.json \
    --fp16 \
    --run_name ${exp_id} \
    --validation_split_percentage 0 \
    --logging_steps 20 \
    --do_train \
    --ddp_timeout 72000 \
    --save_steps 200 \
    --dataloader_num_workers 1 \
    | tee ${log_dir}/train.log \
    2> ${log_dir}/train.err

(screenshot attached)

wheresmyhair commented 1 month ago

Could you please try using python=3.9?

conda create -n lmflow python=3.9 -y
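
For context: this error is typically seen on Python 3.10+, where dataclasses.Field.__init__ gained a required kw_only parameter, so any dependency that still constructs Field objects with the pre-3.10 signature fails with exactly this message. The snippet below is a minimal, hypothetical sketch (not LMFlow code) of that mismatch; on Python 3.9 the same call succeeds, which is why pinning the interpreter works around it.

```python
# Hypothetical reproduction, not LMFlow code: emulate a dependency that was
# written against the Python 3.9 dataclasses API.
import dataclasses
import sys


def make_field_pre_310_style():
    # Python 3.9 signature: Field(default, default_factory, init, repr,
    # hash, compare, metadata). Python 3.10 added a required `kw_only`
    # argument, so this call breaks on 3.10+ with:
    #   TypeError: Field.__init__() missing 1 required positional argument: 'kw_only'
    return dataclasses.Field(
        default=dataclasses.MISSING,
        default_factory=dataclasses.MISSING,
        init=True,
        repr=True,
        hash=None,
        compare=True,
        metadata=None,
    )


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    try:
        make_field_pre_310_style()
        print("Field constructed fine (pre-3.10 interpreter)")
    except TypeError as err:
        print("Same failure as in this issue:", err)
```

Real code hits this indirectly, usually through an older pinned dependency that builds Field objects by hand, so recreating the env with python=3.9 sidesteps the signature change until the dependency is updated.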
shizhediao commented 1 month ago

I also encounter this problem!

wheresmyhair commented 1 week ago

Solved in the latest PR #905.