sasha0552 / vllm-ci

CI scripts designed to build a Pascal-compatible version of vLLM.
MIT License

support for Phi3 Medium models #3

Closed rkyla closed 2 months ago

rkyla commented 3 months ago

Hi, I have tried to load the Phi3 Medium model (128k), but it fails to work with the current version of vLLM here. Is this a version update issue? When I try the Phi3 Mini 128k, it at least attempts to load, but runs out of memory (even 4x 24 GB cards can't fit it; I suspect a bug).

Thanks a lot for creating this patch! @sasha0552
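For context, a sketch of how one might try to fit the Phi-3 Mini 128k model on 4x 24 GB cards with vLLM's OpenAI-compatible server, spreading the model across all four GPUs and capping the context length to shrink the KV cache. The model ID and the exact values below are illustrative assumptions, not taken from this thread:

```sh
# Illustrative only: shard the model over 4 GPUs and cap the context length;
# a full 128k context alone can exhaust 4x 24 GB of VRAM.
python3 -m vllm.entrypoints.openai.api_server \
    --model microsoft/Phi-3-mini-128k-instruct \
    --tensor-parallel-size 4 \
    --max-model-len 32768 \
    --gpu-memory-utilization 0.90
```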

sasha0552 commented 3 months ago

Hi @rkyla.

I have started the process of building the v6 release, which will be based on https://github.com/vllm-project/vllm/commit/1744cc99ba9bdefea8f3f798cf51ed650b81a98e (the current main branch). It'll be ready in about two hours. For now, you can try v5, which was built 3 days ago (I forgot to publish v5, so you probably used v4, which was built a week ago).

If the v6 update doesn't help, I suggest opening an issue in the vLLM repo, as it's probably not a GPU issue.

rkyla commented 3 months ago

Thanks a lot @sasha0552! I checked my vLLM version before the update and it said 0.5.0.post1. To upgrade to the latest, do I just run: pip3 install --extra-index-url https://sasha0552.github.io/vllm-ci/ vllm ?

sasha0552 commented 3 months ago

You need to use pip3 install --force-reinstall --extra-index-url https://sasha0552.github.io/vllm-ci/ --no-cache-dir --no-deps --upgrade vllm to update vLLM between releases of the same version (v4, v5, v6, and now v7 are all the same 0.5.0.post1 version of vLLM, just built from different base commits).
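For reference, the same command laid out as a script, plus one way to confirm afterwards which vLLM build is installed. The version-check step is a suggestion on my part, not something from this thread; the reported version string will stay 0.5.0.post1 across v4-v7, so only the reinstall itself guarantees you picked up the newest build:

```sh
# Force-reinstall vLLM from the Pascal build index without touching dependencies
pip3 install --force-reinstall --no-cache-dir --no-deps --upgrade \
    --extra-index-url https://sasha0552.github.io/vllm-ci/ vllm

# Confirm what is installed (version string only; it does not distinguish v4-v7 builds)
pip3 show vllm
python3 -c "import vllm; print(vllm.__version__)"
```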