kto训练要求response大于1(feedback_dataset)函数

hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Apache License 2.0

25.26k stars 3.13k forks source link

kto训练要求response大于1(feedback_dataset)函数 #4564

Closed tcxia closed 2 days ago

tcxia commented 2 days ago

Reminder

[X] I have read the README and searched the existing issues.

System Info

Reproduction

NPROC_PER_NODE=7 NNODES=1 RANK=0 MASTER_ADDR=127.0.0.1 MASTER_PORT=29504

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6 torchrun \ --nproc_per_node $NPROC_PER_NODE \ --nnodes $NNODES \ --node_rank $RANK \ --master_addr $MASTER_ADDR \ --master_port $MASTER_PORT \ src/train.py examples/aigc_train/llama3/llama3_lora_kto.yaml | tee train_aigc_kto.log