young-geng / EasyLM

Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating, and serving LLMs in JAX/Flax.
Apache License 2.0

Converting the Koala Weights to HF Transformers #89

Closed by xiujiesong 11 months ago

xiujiesong commented 11 months ago

Hi,

Thanks for the great work. To convert the Koala model weights to HuggingFace Transformers format, the following script is given:

# --model_size is one of '7b', '13b', '30b' or '65b'
python -m EasyLM.models.llama.convert_easylm_to_hf \
    --load_checkpoint='params::path/to/koala/checkpoint' \
    --tokenizer_path='path/to/llama/tokenizer' \
    --model_size='13b' \
    --output_dir='path/to/output/huggingface/koala/checkpoint'

So load_checkpoint is the path to the weights diff (e.g. koala_13b_diff_v2), and tokenizer_path is the path to Meta's released LLaMA weights, right?
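
As a side note, once the conversion succeeds, the output directory should be loadable like any other LLaMA-family checkpoint in Hugging Face Transformers. Below is a minimal, illustrative sanity check: the path is a placeholder, it assumes the converter writes both model and tokenizer files to --output_dir, and the Koala-style prompt shown is only an example.

from transformers import LlamaForCausalLM, LlamaTokenizer

# Placeholder path: the directory passed as --output_dir above.
checkpoint_dir = "path/to/output/huggingface/koala/checkpoint"

# Load the converted tokenizer and model (assumes the converter wrote both).
tokenizer = LlamaTokenizer.from_pretrained(checkpoint_dir)
model = LlamaForCausalLM.from_pretrained(checkpoint_dir)

# Generate a short completion as a smoke test of the converted checkpoint.
prompt = "BEGINNING OF CONVERSATION: USER: What is Koala? GPT:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))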

young-geng commented 11 months ago

Due to the LLaMA license, we are not able to release the Koala weights directly, so we released them as a diff against the base LLaMA model weights. You therefore need to obtain the base LLaMA weights from Meta and recover the Koala weights first. Please follow the Koala documentation here.
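
To illustrate what recovering from a diff means, here is a rough JAX sketch, not EasyLM's actual script: the Koala documentation provides a dedicated checkpoint-diff command, and the exact arithmetic it uses lives there. The function and parameter names below are hypothetical; the idea is simply that the released diff is combined leaf by leaf with the base LLaMA parameter pytree to reproduce the full Koala weights, which can then be fed to convert_easylm_to_hf as shown above.

import jax
import jax.numpy as jnp

def recover_from_diff(base_params, diff_params):
    # Combine the base weights with the released diff, leaf by leaf over two
    # pytrees of identical structure (hypothetical helper, for illustration).
    return jax.tree_util.tree_map(lambda base, diff: base + diff,
                                  base_params, diff_params)

# Tiny made-up pytrees standing in for the real LLaMA / Koala-diff checkpoints.
base = {"wte": jnp.ones((4, 8)), "lm_head": jnp.zeros((8, 4))}
diff = {"wte": 0.1 * jnp.ones((4, 8)), "lm_head": jnp.full((8, 4), 0.5)}

koala = recover_from_diff(base, diff)
print(jax.tree_util.tree_map(lambda x: x.shape, koala))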