young-geng / EasyLM

Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Apache License 2.0

HF->EasyLM checkpoint conversion #68

Closed: syzymon closed this issue 1 year ago

syzymon commented 1 year ago

Following up on the issue: https://github.com/young-geng/EasyLM/issues/7

I cleaned up a script proposed by @Lisennlp and successfully converted the XGen 7B model, which is LLaMA-compatible (https://huggingface.co/Salesforce/xgen-7b-8k-base), from HF to EasyLM format so that it can be fine-tuned with EasyLM. The converted model seems to work reasonably well in EasyLM: math-arXiv perplexity improves from 2.81 to 2.46 when the context length is increased from 2K to 8K.

Credit goes to @Lisennlp for creating the script.
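
For readers unfamiliar with what an HF-to-Flax conversion typically involves, here is a minimal sketch of the general idea. It is not the script discussed in this thread. It assumes the usual layout difference between the two frameworks: a PyTorch `nn.Linear` weight is stored as `(out_features, in_features)`, while a Flax `Dense` kernel is `(in_features, out_features)`, so linear weights are transposed and keys are renamed. The target key names (`transformer/h/...`) and the helper `convert_layer` are hypothetical placeholders, and plain nested lists stand in for real arrays. Query/key projections may need an additional layout change for rotary embeddings depending on the target implementation; that step is omitted here.

```python
# Sketch only: the target-side key names are hypothetical placeholders,
# not the names used by the actual conversion script.


def transpose(matrix):
    """Transpose a 2-D weight stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


# HF LLaMA-style key suffix -> (hypothetical target name, needs_transpose).
# Linear weights are transposed; 1-D norm scales are copied unchanged.
KEY_MAP = {
    "self_attn.o_proj.weight": ("attention/wo/kernel", True),
    "mlp.down_proj.weight": ("feed_forward/w2/kernel", True),
    "input_layernorm.weight": ("attention_norm/kernel", False),
}


def convert_layer(hf_state, layer_idx):
    """Rename and re-layout the weights of one transformer layer."""
    prefix = f"model.layers.{layer_idx}."
    converted = {}
    for suffix, (target, needs_transpose) in KEY_MAP.items():
        weight = hf_state[prefix + suffix]
        if needs_transpose:
            weight = transpose(weight)
        converted[f"transformer/h/{layer_idx}/{target}"] = weight
    return converted


if __name__ == "__main__":
    # Toy 2x3 "weights" in place of real checkpoint tensors.
    toy_state = {
        "model.layers.0.self_attn.o_proj.weight": [[1, 2, 3], [4, 5, 6]],
        "model.layers.0.mlp.down_proj.weight": [[7, 8], [9, 10]],
        "model.layers.0.input_layernorm.weight": [1.0, 1.0, 1.0],
    }
    for name, value in convert_layer(toy_state, 0).items():
        print(name, value)
```

In a real conversion the same renaming and transposition would be applied to NumPy or JAX arrays loaded from the HF checkpoint, and the resulting tree would be saved in whatever format the training framework expects.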

young-geng commented 1 year ago

This is awesome! Thanks a lot for making this script.