zhangzhenyu13 / llm3s-conatiner

Large language model training in three stages (SFT, reward model, RLHF), plus deployment

Complete docs

Detailed, complete documentation for everything

Install the environment

First install PyTorch 2.0 by following https://pytorch.org/get-started/locally/, then install the remaining dependencies:

pip install -r requirements.txt
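Before installing the remaining dependencies, it can help to confirm that the active environment really has PyTorch 2.0 or newer. The following is a minimal sketch, not part of this repository: the function name torch_version_ok is hypothetical, and it only inspects the major version number of a version string such as the one printed by torch.__version__.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): succeeds when the given
# version string has a major version of 2 or higher.
torch_version_ok() {
    local version="$1"
    local major="${version%%.*}"   # text before the first dot, e.g. "2" from "2.0.1+cu118"
    # Reject anything whose major part is not a plain number, then compare.
    [[ "$major" =~ ^[0-9]+$ ]] && (( major >= 2 ))
}

# Usage (assumes PyTorch is already installed in the active environment):
#   installed="$(python -c 'import torch; print(torch.__version__)')"
#   torch_version_ok "$installed" || echo "PyTorch 2.0+ is required, found: $installed" >&2
```

The check is deliberately coarse: it does not compare minor or patch versions, and it does not verify CUDA support.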

deploy necessary settings

Train the SFT model

bash run.sh

Train the reward model

bash run-reward.sh

Train the RLHF model

bash run-rlhf.sh
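The three commands above can also be chained so that a later stage never starts after an earlier one has failed. This is a sketch only, assuming the stages are meant to run in the order listed (SFT, then reward model, then RLHF) and that each script signals failure with a non-zero exit code. The wrapper name run_all_stages is hypothetical; only the three script names come from this README.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of this repo): run the three training
# scripts in order and stop at the first one that fails.
run_all_stages() {
    local dir="${1:-.}"    # directory holding the scripts; defaults to the current directory
    local script
    for script in run.sh run-reward.sh run-rlhf.sh; do
        echo "starting ${script}"
        if ! bash "${dir}/${script}"; then
            echo "${script} failed; later stages were not run" >&2
            return 1
        fi
    done
    echo "all stages finished"
}

# Usage, from the repository root:
#   run_all_stages .
```

Running the scripts one by one, as shown above, remains the simplest option when settings need to be adjusted between stages.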

Prepare data

SFT data

Refer to sft-data-construction.

Reward data and RLHF data

Refer to rlhf-ppo.