Add JapaneseStableLM Support #3373

Closed: azulika closed this issue 5 months ago

azulika commented 11 months ago

Hi, I wanted to convert the .bin files of https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b into .gguf, but it doesn't seem to work. Based on a past issue (https://github.com/ggerganov/llama.cpp/issues/1063), StableLM's architecture appears to be the same as GPT-NeoX, but convert-gptneox-hf-to-gguf.py didn't work.

azulika commented 11 months ago

According to https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/discussions/5, stablelm-base-alpha-7b and the japanese-stablelm series use different prefixes for their layers. I don't think that's the only reason the conversion fails, though: replacing gpt_neox with transformer still didn't work, although it showed some progress.
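For illustration, a minimal sketch of that prefix mismatch, assuming the checkpoint loads as a plain PyTorch state dict; the filename below is hypothetical, not the repo's actual shard name:

```python
import torch

# Hypothetical sketch: rename japanese-stablelm's "transformer." prefix to
# the "gpt_neox." prefix that convert-gptneox-hf-to-gguf.py expects.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

renamed = {
    name.replace("transformer.", "gpt_neox.", 1): tensor
    for name, tensor in state_dict.items()
}

for name in sorted(renamed):
    print(name)  # inspect which names still lack a GPT-NeoX counterpart
```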

azulika commented 11 months ago

It seems JapaneseStableLM's architecture has three unique layer names (packed_input_proj, out_proj, and attention.rotary_emb.scale) while lacking three that StableLM, and presumably regular GPT-NeoX models, have: input_layernorm, dense_4h_to_h, and dense_h_to_4h. I'm not very familiar with model structures, but I guess supporting these layer names might do the trick.
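A hedged sketch of what such a rename could look like; mapping the two MLP projections onto GPT-NeoX's dense_h_to_4h / dense_4h_to_h slots is a guess on my part (packed_input_proj may well pack more than one projection), and attention.rotary_emb.scale has no obvious GPT-NeoX counterpart at all:

```python
# Assumed correspondence between JapaneseStableLM's MLP tensor names and
# the GPT-NeoX names the converter already knows about. Not verified
# against the model's modeling code.
JSLM_TO_NEOX = {
    "mlp.packed_input_proj": "mlp.dense_h_to_4h",  # assumption
    "mlp.out_proj":          "mlp.dense_4h_to_h",  # assumption
}

def translate(name: str) -> str:
    """Rewrite one checkpoint tensor name into GPT-NeoX style."""
    for src, dst in JSLM_TO_NEOX.items():
        if src in name:
            return name.replace(src, dst)
    return name
```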

[Edit] I renamed gguf.py's layer names accordingly, and with that the export to .gguf itself succeeded, but loading the model fails with error loading model: invalid n_rot: 32, expected 128. The value 32 seems to come from either num_attention_heads or num_hidden_layers (both are 32) in japanese-stablelm-instruct-alpha-7b's config.json. Changing that value to 128 doesn't work either; it just produces error loading model: invalid n_rot: 8, expected 32 instead. I suspect rotary_pct: 0.25 in config.json is being mishandled somewhere during the conversion to gguf.
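For what it's worth, the numbers line up with the usual GPT-NeoX partial-rotary convention, where only a fraction of each head's dimensions get rotary embeddings. A back-of-the-envelope check (the config values are the ones discussed above; the formula itself is my assumption):

```python
# Where n_rot = 32 plausibly comes from, assuming the standard GPT-NeoX
# partial-rotary formula n_rot = head_dim * rotary_pct.
hidden_size = 4096          # implied by "expected 128" with 32 heads
num_attention_heads = 32    # from config.json
rotary_pct = 0.25           # from config.json

head_dim = hidden_size // num_attention_heads  # 4096 // 32 = 128
n_rot = int(head_dim * rotary_pct)             # int(128 * 0.25) = 32

# llama.cpp's loader seems to require n_rot == head_dim, hence
# "invalid n_rot: 32, expected 128". Bumping num_attention_heads to 128
# shrinks head_dim to 32 and n_rot to 8, matching the second error,
# "invalid n_rot: 8, expected 32".
```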

github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.