jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0
7.3k stars 425 forks

Convert weights to original llama weights. #149

Closed PSanni closed 4 months ago

PSanni commented 5 months ago

Hi,

Is it possible to convert these weights into the https://github.com/facebookresearch/llama/tree/main/llama format?

ChaosCodes commented 4 months ago

Sorry, we are not planning to support converting the weights from the lit format into the original llama format in our repo. Can you check this instead: https://github.com/facebookresearch/llama-recipes/blob/98fcc538ff82bd8987b31026dd7f21c01bc6f46b/src/llama_recipes/tools/convert_hf_weights_to_llama.py#L4 ?

cnlnpjhsy commented 2 months ago

Has anyone successfully converted it to the original llama weights? I followed the code provided by @ChaosCodes but got an error. It seems that some dimensions do not match:

  File "convert_hf_weights_to_llama.py", line 163, in main
    write_model(model_path, model_size, output_dir)
  File "convert_hf_weights_to_llama.py", line 90, in write_model
    permute(
  File "convert_hf_weights_to_llama.py", line 58, in permute
    w.view(n_heads, 2, dim1 // n_heads // 2, dim2)
RuntimeError: shape '[4, 2, 64, 2048]' is invalid for input of size 524288

My params.json:

{"dim": 2048, "multiple_of": 256, "n_heads": 32, "n_kv_heads": 4, "n_layers": 22, "norm_eps": 1e-05, "vocab_size": -1}
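
The numbers in the traceback are consistent with a grouped-query-attention mismatch: per the params.json, TinyLlama has `n_heads: 32` but only `n_kv_heads: 4`, so the k/v projection weights have first dimension `n_kv_heads * head_dim = 4 * 64 = 256`, i.e. 256 * 2048 = 524288 elements, while the failing view `[4, 2, 64, 2048]` expects 1048576 elements (a `dim1` of 512 rather than 256). A minimal numpy sketch of the rotary-reordering reshape (the function name and call signature here mirror the script's `permute`, but this is an illustrative reconstruction, not the script itself):

```python
import numpy as np

# TinyLlama dimensions from the params.json above
dim, n_heads, n_kv_heads = 2048, 32, 4
head_dim = dim // n_heads          # 64

def permute(w, n_heads, dim1, dim2):
    # numpy stand-in for the torch permute in the conversion script:
    # un-interleave the rotary-embedding halves of each head
    return (w.reshape(n_heads, 2, dim1 // n_heads // 2, dim2)
             .transpose(0, 2, 1, 3)
             .reshape(dim1, dim2))

# k_proj in a GQA model uses the KV-head count, not the query-head count
kv_dim = n_kv_heads * head_dim     # 256
k_proj = np.zeros((kv_dim, dim), dtype=np.float32)

# Calling permute with dim1=512 (as the traceback suggests the script does)
# fails, because k_proj only has 256*2048 elements.
# Calling it with the KV dimensions succeeds:
out = permute(k_proj, n_kv_heads, kv_dim, dim)
print(out.shape)  # (256, 2048)
```

If this reading is right, the fix would be to pass the k/v weights through `permute` with `n_kv_heads` and `n_kv_heads * head_dim` instead of the query-side values (only `q_proj` uses the full `n_heads`/`dim`).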