Support for Phi-3-mini-128k-instruct

unslothai / unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

https://unsloth.ai

Apache License 2.0

Support for Phi-3-mini-128k-instruct #700

Open dcsuka opened 3 weeks ago

dcsuka commented 3 weeks ago

Unless I am mistaken, Phi-3-mini-128k-instruct has the same number of parameters and the same architecture as Phi-3-mini-4k-instruct. Would it be possible for unsloth to support inference for this model as well? Thank you.

danielhanchen commented 2 weeks ago

The issue is that RoPE scaling differs between Phi-3 4K and 128K: for the 128K model they use some weird dynamic RoPE scaling, which I haven't looked into yet.
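
To illustrate why a different RoPE scaling matters even when the parameter count and layer layout match, here is a minimal sketch in plain Python. It is not unsloth's code and not the actual Phi-3 128K scheme, which this thread does not describe. The function names and the per-pair `factors` list are hypothetical, and the factor values are made up. The sketch only shows that rescaling the rotary frequencies changes the rotation angle applied at every position, so a kernel written for plain RoPE would produce different embeddings.

```python
import math

def rope_inv_freq(head_dim, base=10000.0, factors=None):
    """Inverse frequencies for rotary position embeddings (RoPE).

    factors is a hypothetical list of per-pair divisors, one for each
    rotated pair of dimensions. None means plain, unscaled RoPE.
    """
    half = head_dim // 2
    if factors is None:
        factors = [1.0] * half
    if len(factors) != half:
        raise ValueError("need exactly one factor per rotated pair")
    return [1.0 / (f * base ** (2 * i / head_dim)) for i, f in enumerate(factors)]

def rotation_angles(position, inv_freq):
    """Rotation angle applied to each dimension pair at a token position."""
    return [position * w for w in inv_freq]

# Plain RoPE versus a rescaled variant (factor values are illustrative only).
plain = rope_inv_freq(8)
scaled = rope_inv_freq(8, factors=[1.0, 2.0, 4.0, 8.0])

# A pair slowed down by a factor of 8 reaches at position 8 the angle
# that the plain version already reaches at position 1.
angle_plain = rotation_angles(1, plain)[3]
angle_scaled = rotation_angles(8, scaled)[3]
print(math.isclose(angle_plain, angle_scaled))
```

Under these assumptions, supporting the 128K model would mean computing the rotary frequencies from whatever scaling the model's configuration specifies, rather than from the fixed formula that suffices for the 4K model.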