Open judyhappy opened 1 year ago
Step 1 and Step 2 don't generate any config.json. So which config.json should be used for step 3?
Hello, have you solved this problem yet? Could you tell me which config.json should be used for step 3? Thank you!
Hi Jason,
I followed the steps. Step 1, Supervised Fine-tuning, generated "/checkpoints/supervised_llama/", including folders:
Step 2, Training Reward Model, generated "/checkpoints/training_reward_model/", including folders:
Then I ran Step 3, Tuning LM with PPO.
But it fails with an error:
There is no config.json under supervised_llama or training_reward_model.
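For anyone hitting the same thing, here is a minimal sketch for checking whether a config.json exists anywhere under the two checkpoint directories, including nested subfolders. The helper name `find_config_files` is hypothetical and not part of the repo; the two paths are the ones quoted above. It only reports what is on disk and does not say which config step 3 is supposed to use.

```python
from pathlib import Path


def find_config_files(checkpoint_dir, name="config.json"):
    """Return sorted paths of every file called `name` under checkpoint_dir.

    Hypothetical helper for debugging; returns an empty list if the
    directory does not exist.
    """
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return []
    # rglob searches the directory itself and all nested subfolders.
    return sorted(p for p in root.rglob(name) if p.is_file())


if __name__ == "__main__":
    # Paths taken from the comment above; adjust to your own setup.
    for d in ("/checkpoints/supervised_llama/",
              "/checkpoints/training_reward_model/"):
        hits = find_config_files(d)
        print(d, "->", [str(h) for h in hits] or "no config.json found")
```

If this prints a path inside a nested subfolder, that suggests the config was saved one level deeper than the directory passed to step 3; if it prints nothing for both, steps 1 and 2 did not write one at all.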