All HF llama models use falcon-style RoPE, and we can convert them to the original llama-style RoPE with a permutation. That pull request fixed the bug that occurred when converting HF GQA models to the gguf format. I borrowed its idea and fixed the similar bug in llama2.c's export.py. I can now successfully convert TinyLlama-1.1B-Chat to llama-style RoPE, so we can remove the falcon RoPE code path. I have uploaded the new export.py and llama2.mojo.
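For illustration, here is a minimal sketch of the kind of permutation involved, in NumPy. It reorders the rows of a q/k projection matrix so that the interleaved (HF/"falcon"-style) RoPE pair layout becomes the half-split original llama layout. This is an assumption-laden sketch, not the exact code from export.py; in particular, the GQA fix amounts to permuting the key projection with the number of KV heads rather than the number of query heads.

```python
import numpy as np

def permute(w: np.ndarray, n_heads: int) -> np.ndarray:
    """Reorder rows of a (dim1, dim2) projection matrix per head:
    interleaved RoPE pairs -> first-half/second-half layout.
    For GQA, call this on wk with n_heads = n_kv_heads (the bug fix)."""
    dim1, dim2 = w.shape
    head_dim = dim1 // n_heads
    return (w.reshape(n_heads, 2, head_dim // 2, dim2)
             .transpose(0, 2, 1, 3)
             .reshape(dim1, dim2))

# Tiny example: one head, head_dim = 4.
w = np.arange(8).reshape(4, 2)
out = permute(w, n_heads=1)
# Rows are reordered [0, 2, 1, 3]: even pair members first, odd second.
```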
Details:
python export.py tl-chat.bin --hf PY007/TinyLlama-1.1B-Chat-v0.2 --version 0 to convert the model