Open dcsuka opened 3 weeks ago
Phi-3-mini-128k-instruct has the same number of parameters and same architecture as Phi-3-mini-4k-instruct, unless I am mistaken. Would it be possible for unsloth to support inference for this model as well? Thank you.
The issue is that RoPE scaling differs between Phi-3 4K and 128K: the 128K variant uses some unusual dynamic RoPE scaling, which I haven't looked into yet.
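For anyone wondering why a RoPE scaling difference matters when the parameter count and layers match, here is a minimal sketch of the general idea. It is not Phi-3's actual implementation and not Unsloth code: the function names `rope_inv_freq` and `rope_angles` are hypothetical, and the per-dimension `factors` are made-up numbers chosen only for illustration. The assumption is that a long-context variant rescales the rotary frequencies in some way, so the same weights see different position angles.

```python
import math

def rope_inv_freq(head_dim, base=10000.0, factors=None):
    """Inverse rotary frequencies, one per pair of head dimensions.

    `factors` is a hypothetical per-pair rescaling list; with None this
    reduces to plain, unscaled RoPE.
    """
    half = head_dim // 2
    if factors is None:
        factors = [1.0] * half
    if len(factors) != half:
        raise ValueError("need one factor per rotary dimension pair")
    return [1.0 / (f * base ** (2 * i / head_dim)) for i, f in enumerate(factors)]

def rope_angles(position, inv_freq):
    """Rotation angle applied to each dimension pair at a token position."""
    return [position * w for w in inv_freq]

# Plain RoPE versus an illustrative (made-up) rescaling.
plain = rope_inv_freq(8)
scaled = rope_inv_freq(8, factors=[1.0, 1.5, 2.0, 4.0])

# Same position, different angles: kernels written for the plain case
# would rotate queries and keys incorrectly for the rescaled model.
print(rope_angles(1000, plain))
print(rope_angles(1000, scaled))
```

So even with identical weights and shapes, inference code has to apply whatever scaling the checkpoint's config specifies, which is why the two variants can't be treated as interchangeable.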