young-geng / EasyLM

Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
Apache License 2.0

Config file to train openllama 7B v2 #73

Closed yhcc closed 1 year ago

yhcc commented 1 year ago

Thanks for releasing OpenLLaMA 7B v2. I am wondering whether this example script is the one you used to train it: https://github.com/young-geng/EasyLM/blob/main/examples/pretrain_llama_7b.sh

young-geng commented 1 year ago

For v2, we used half the batch size, half the learning rate, and double the number of steps. Other than that, the configuration is the same as v1. I believe these differences probably don't matter nearly as much as the differences in the datasets.
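A quick sanity check on the scaling described above: halving the batch size while doubling the step count leaves the total number of training examples unchanged, and halving the learning rate alongside the batch size preserves the lr/batch ratio (the linear scaling rule). The numbers below are hypothetical placeholders, not the actual OpenLLaMA configuration.

```python
# Hypothetical v1 config values, for illustration only --
# not the actual OpenLLaMA v1 numbers.
v1 = {"batch_size": 2048, "learning_rate": 3e-4, "num_steps": 250_000}

# Apply the v2 changes described above.
v2 = {
    "batch_size": v1["batch_size"] // 2,       # half the batch size
    "learning_rate": v1["learning_rate"] / 2,  # half the learning rate
    "num_steps": v1["num_steps"] * 2,          # double the number of steps
}

# Half the batch times double the steps == the same total example count,
# so v1 and v2 see the same amount of data per epoch of training.
assert v1["batch_size"] * v1["num_steps"] == v2["batch_size"] * v2["num_steps"]

# Learning rate scaled in proportion to batch size (linear scaling rule).
assert v1["learning_rate"] / v1["batch_size"] == v2["learning_rate"] / v2["batch_size"]

print(v2)
```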

yhcc commented 1 year ago

Thanks for your reply. I have another question about the OpenLLaMA v2 training; thanks in advance for your patience. Did you apply any extra cleaning to these data splits, or did you use the datasets as-is?

young-geng commented 1 year ago

We didn't apply any extra cleaning. We simply combined the dataset and shuffled the examples.
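The "combine and shuffle" step described above can be sketched as follows, using toy in-memory lists in place of the real data sources (the variable names and sizes here are illustrative, not from EasyLM):

```python
import random

# Toy stand-ins for two dataset splits; each example is a dict of text.
split_a = [{"text": f"a{i}"} for i in range(3)]
split_b = [{"text": f"b{i}"} for i in range(3)]

# Combine the splits as-is (no extra cleaning), then shuffle globally
# at the example level so the sources are interleaved.
combined = split_a + split_b
rng = random.Random(42)  # fixed seed for a reproducible order
rng.shuffle(combined)

print([ex["text"] for ex in combined])
```

In practice, pretraining pipelines usually do this shuffle at the shard or buffer level rather than loading everything into memory, but the effect is the same: examples from all sources are mixed into a single stream.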