PygmalionAI / aphrodite-engine

Large-scale LLM inference engine
https://aphrodite.pygmalion.chat
GNU Affero General Public License v3.0

[New Model]: Phi3ForCausalLM #478

Closed: sparsh35 closed this issue 1 month ago

sparsh35 commented 4 months ago

The model to consider.

https://huggingface.co/microsoft/Phi-3-medium-128k-instruct

I was trying to run the exl2 quants for these models, but I get an error at the rotary embedding: these models use two RoPE scaling factors, long_factor and short_factor. The model is good, and vLLM and Hugging Face have both merged support for it, but they don't support exl2.
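
For context, here is a minimal sketch of how I understand the two factor lists to be used. It assumes each list holds one rescale factor per rotary frequency (head_dim / 2 entries) and that long_factor applies once the sequence is longer than the original context length. The function and parameter names are hypothetical and are not taken from Aphrodite or vLLM.

```python
def scaled_inv_freq(head_dim, base, short_factor, long_factor,
                    seq_len, original_max_pos):
    """Return per-frequency inverse frequencies with Phi-3 style rescaling.

    Assumption: short_factor is used while seq_len fits in the original
    context window, long_factor once it exceeds it.
    """
    factors = long_factor if seq_len > original_max_pos else short_factor
    if len(factors) != head_dim // 2:
        raise ValueError("expected one factor per rotary frequency")
    # Standard RoPE inverse frequency, divided by the per-dimension factor.
    return [
        1.0 / (factor * base ** (2 * i / head_dim))
        for i, factor in enumerate(factors)
    ]


# Toy values only; the real lists come from the model's rope_scaling config.
short = [1.0, 1.0]
long = [2.0, 4.0]
print(scaled_inv_freq(4, 10000.0, short, long, seq_len=4, original_max_pos=8))
print(scaled_inv_freq(4, 10000.0, short, long, seq_len=16, original_max_pos=8))
```

A loader that only expects a single scalar scaling factor would not know what to do with these two lists, which may be why the rotary embedding step fails.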

The closest model Aphrodite already supports.

No response

What's your difficulty of supporting the model you want?

Relevant merged PR:

https://github.com/vllm-project/vllm/pull/4298

localbarrage commented 3 months ago

Bump

murtaza-nasir commented 3 months ago

Bump

sgsdxzy commented 3 months ago

It's currently in the rc_054 branch, so you can test it there. Please note that some quantizations are broken at the moment.

AlpinDale commented 1 month ago

Added as of v0.6.0