OpenNMT / CTranslate2

Fast inference engine for Transformer models
https://opennmt.net/CTranslate2
MIT License

Phi-3 support #1672

Closed — Theodotus1243 closed this 4 months ago

Theodotus1243 commented 4 months ago

Powerful model trained on synthetic data, with a high MMLU score.

The 4K context window one should be easier, as it has no LongRoPE.

https://huggingface.co/microsoft/Phi-3-mini-4k-instruct https://arxiv.org/pdf/2404.14219.pdf

BBC-Esq commented 4 months ago

I second this. The current phi loader is broken, apparently because of some changes Microsoft made to the model after it was initially released. At any rate, adapting the phi loader to the new phi3 should be easier than starting from scratch.

jncraton commented 4 months ago

For anyone else researching this, phi3 support has been added to the convert_hf_to_gguf.py script in llama.cpp. Perhaps something can be gleaned from there to simplify the implementation of the ct2 converter.

vince62s commented 4 months ago

No worries, it will be done. It's quite easy for the mini-4k since it uses the full Llama 2 architecture. FYI: https://forum.opennmt.net/t/phi-3-3-8b-llama2-7b-ensemble-just-for-fun/5729
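To make the "uses the Llama 2 architecture" point concrete, here is a minimal, purely illustrative sanity check (not part of any converter): the mini-4k config exposes the same fields a Llama-2-style decoder loader would read, so printing them is a quick way to confirm there is nothing exotic to map.

```python
from transformers import AutoConfig

# Fields a Llama-2-style decoder converter typically reads; "<absent>" flags
# anything Phi-3 does not expose under the same name.
phi3 = AutoConfig.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
for field in (
    "hidden_size",
    "intermediate_size",
    "num_hidden_layers",
    "num_attention_heads",
    "num_key_value_heads",
    "hidden_act",
    "rms_norm_eps",
    "rope_theta",
    "max_position_embeddings",
    "vocab_size",
):
    print(f"{field}: {getattr(phi3, field, '<absent>')}")
```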

BBC-Esq commented 4 months ago

Is it done yet? I've been waiting patiently for approximately two hours now. ;-)

minhthuc2502 commented 4 months ago

Hello, I am working on it. Some unexpected problems have appeared.

BBC-Esq commented 4 months ago

I'm not skilled enough to help directly by implementing the code, but if you want me to do any grunt work or research, let me know, dude; anything to help speed up the process. Thanks!

BBC-Esq commented 4 months ago

I'd like to start learning so I can eventually help. Question: how do I get the actual model architecture to start with? It's my understanding that learning the model's structure, which activation functions are used, and so on is key to writing additional converters down the road. For example, here's a link:

https://bbycroft.net/llm

Here are some other links I've been gathering with the goal of eventually contributing a converter, based on first trying to understand the structure of LLMs:

https://github.com/mert-kurttutan/torchview

https://github.com/lutzroeder/netron

Hugging Face sometimes (but not always) has information like this:

[image: example of architecture details shown on a Hugging Face model page]

Basically, is there any good starting point you'd recommend, dude? Thanks!
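Not speaking for the maintainers, but a common starting point is simply to print the Hugging Face config and module tree. The snippet below is a minimal sketch along those lines; the model ID is the one linked above, and `accelerate`'s `init_empty_weights` is used only so the 3.8B-parameter model can be instantiated without downloading or allocating real weights.

```python
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "microsoft/Phi-3-mini-4k-instruct"

# The config lists the high-level hyperparameters: hidden size, number of
# layers and heads, activation function, norm epsilon, rope settings, etc.
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
print(config)

# Instantiate the architecture on the "meta" device (shapes only, no real
# weights) just to see the module tree: attention blocks, MLPs, norms.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
print(model)
```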

BBC-Esq commented 4 months ago

Remember, you're dealing with an idiot who doesn't do this professionally and has never taken an LLM 101 class in college, let alone earned a doctoral degree. ;-) I don't even know what "mlp.down" or "layernorm.weight" mean, for example, but I am willing to learn.
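For what it's worth, and only as a general observation rather than a statement about this converter: names like `mlp.down_proj.weight` or `input_layernorm.weight` are just the names of weight tensors inside the Hugging Face checkpoint, and a converter's main job is to walk those names and copy each tensor into the corresponding field of its own model specification. A self-contained sketch for listing them (same hedges as above: meta-device instantiation, no real weights):

```python
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
with init_empty_weights():  # meta device: shapes only, no weight download
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

# Names such as "model.layers.0.mlp.down_proj.weight" (the MLP down-projection)
# or "model.layers.0.input_layernorm.weight" (the pre-attention RMSNorm scale)
# are the checkpoint's tensor names that a converter has to map.
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
```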

minhthuc2502 commented 4 months ago

PR #1680 adds the converter for Phi-3.
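Once that PR is merged into a released build, conversion should presumably follow the same path as the other Transformers-based models. A minimal sketch, assuming the Phi-3 loader behaves like the existing loaders; the output directory name, the int8 quantization choice, and the Phi-3 chat template shown in the prompt are all assumptions for illustration.

```python
import ctranslate2
from transformers import AutoTokenizer
from ctranslate2.converters import TransformersConverter

model_id = "microsoft/Phi-3-mini-4k-instruct"

# Convert the Hugging Face checkpoint into a CTranslate2 model directory.
converter = TransformersConverter(
    model_id,
    trust_remote_code=True,  # Phi-3 shipped with custom modeling code at release
)
converter.convert("phi-3-mini-4k-ct2", quantization="int8")

# Decoder-only models are loaded with the Generator API and take tokenized input.
generator = ctranslate2.Generator("phi-3-mini-4k-ct2")
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Assumed Phi-3 chat format; adjust to the model card if it differs.
prompt = "<|user|>\nWhat is CTranslate2?<|end|>\n<|assistant|>\n"
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([tokens], max_length=64, sampling_topk=1)
print(tokenizer.decode(results[0].sequences_ids[0]))
```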