Closed Bostoncake closed 2 months ago
It is a typo, now fixed.
Would you upload training scripts for Llama-2 13B in the future?
I currently do not have enough spare compute to experiment with that. :(
Thanks for your reply! I am currently working on 13B models. Hope things work out. If so, I will create PRs and share my training solutions.
@Bostoncake Thanks that would be very helpful!
In `train_scripts/EasyContext-1M-Llama-2-7B.sh`, line 53 specifies `--model PY007/Llama2-7B-64K`. Why isn't it `--model ./output/7B_64K_bs_1M_rope_5M_step_1000_lr_2e-5`, which is the output model of the previous training stage?

Also, would you upload training scripts for Llama-2 13B in the future? I really appreciate this work and I am looking forward to it. Thanks!
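For anyone reading along, the chaining the question describes can be sketched as below. This is a hedged illustration only: the two paths come from the thread, but the `train.py` entry point and launcher invocation are assumptions, not the repo's actual command line.

```shell
#!/usr/bin/env sh
# Stage 1 writes its checkpoint to this directory (path taken from the thread).
STAGE1_OUT=./output/7B_64K_bs_1M_rope_5M_step_1000_lr_2e-5

# Stage 2 should load that checkpoint via --model, rather than the
# PY007/Llama2-7B-64K hub checkpoint the script originally pointed at.
# (train.py and the launcher name are hypothetical placeholders.)
STAGE2_CMD="accelerate launch train.py --model $STAGE1_OUT"
echo "$STAGE2_CMD"
```

The point is simply that each stage's `--model` argument must be the previous stage's output directory so training continues from the right weights.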