intelligent-machine-learning / dlrover

DLRover: An Automatic Distributed Deep Learning System

Question: How does DLRover integrate with Llama Factory? #1244

Open hetingyou opened 1 month ago

hetingyou commented 1 month ago

My first instinct is to modify examples/pytorch/nanogpt/elastic_job.yaml:

command:

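After changing it to the following form, the job fails with an error saying the file llamafactory-cli cannot be found. Does that mean the command must be followed by a train.py file?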

command: