skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0

[Examples] Specify version for vllm because vllm v0.6.4.post1 has an issue #4391

Closed HysunHe closed 10 hours ago

HysunHe commented 15 hours ago

The latest vLLM release, v0.6.4.post1, has an issue, so this PR pins vLLM to v0.6.3.post1 in the serve-qwen-7b.yaml file. That version works fine.
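As a quick way to confirm that the pin took effect inside the serving environment, a minimal sketch like the following could be run there. The helper name `matches_pin` and the constant `PINNED_VLLM` are hypothetical and not part of SkyPilot or vLLM; only the standard-library `importlib.metadata` API is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Version this PR pins in serve-qwen-7b.yaml (taken from the description above).
PINNED_VLLM = "0.6.3.post1"


def matches_pin(package, expected):
    """Return True only if `package` is installed at exactly `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # Package is not installed in this environment at all.
        return False


if __name__ == "__main__":
    status = "ok" if matches_pin("vllm", PINNED_VLLM) else "mismatch or not installed"
    print(f"vllm pin {PINNED_VLLM}: {status}")
```

The check is a plain string comparison, so it only recognises the exact pinned version and does not attempt to interpret version ranges.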

The latest released version of vLLM is currently 0.6.4.post1. Running serve-qwen-7b.yaml with that version fails with the following error:

(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1778, in _get_module
(qwen-llm2, pid=9356)     return importlib.import_module("." + module_name, self.__name__)
(qwen-llm2, pid=9356)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/importlib/__init__.py", line 90, in import_module
(qwen-llm2, pid=9356)     return _bootstrap._gcd_import(name[level:], package, level)
(qwen-llm2, pid=9356)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(qwen-llm2, pid=9356)   File "", line 1387, in _gcd_import
(qwen-llm2, pid=9356)   File "", line 1360, in _find_and_load
(qwen-llm2, pid=9356)   File "", line 1331, in _find_and_load_unlocked
(qwen-llm2, pid=9356)   File "", line 935, in _load_unlocked
(qwen-llm2, pid=9356)   File "", line 995, in exec_module
(qwen-llm2, pid=9356)   File "", line 488, in _call_with_frames_removed
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/transformers/processing_utils.py", line 33, in
(qwen-llm2, pid=9356)     from .image_utils import ChannelDimension, is_valid_image, is_vision_available
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/transformers/image_utils.py", line 58, in
(qwen-llm2, pid=9356)     from torchvision.transforms import InterpolationMode
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/torchvision/__init__.py", line 10, in
(qwen-llm2, pid=9356)     from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip
(qwen-llm2, pid=9356)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/torchvision/_meta_registrations.py", line 163, in
(qwen-llm2, pid=9356)     @torch.library.register_fake("torchvision::nms")
(qwen-llm2, pid=9356)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/torch/library.py", line 654, in register
(qwen-llm2, pid=9356)     use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/torch/library.py", line 154, in _register_fake
(qwen-llm2, pid=9356)     handle = entry.abstract_impl.register(func_to_register, source)
(qwen-llm2, pid=9356)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/torch/_library/abstract_impl.py", line 31, in register
(qwen-llm2, pid=9356)     if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
(qwen-llm2, pid=9356)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(qwen-llm2, pid=9356) RuntimeError: operator torchvision::nms does not exist
(qwen-llm2, pid=9356)
(qwen-llm2, pid=9356) The above exception was the direct cause of the following exception:
(qwen-llm2, pid=9356)
(qwen-llm2, pid=9356) Traceback (most recent call last):
(qwen-llm2, pid=9356)   File "", line 189, in _run_module_as_main
(qwen-llm2, pid=9356)   File "", line 112, in _get_module_details
(qwen-llm2, pid=9356)   File "/home/ubuntu/miniforge3/envs/vllm/lib/python3.12/site-packages/vllm/__init__.py", line 3, in
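The traceback passes through transformers, torchvision, and torch before reaching vllm, so when reproducing it, it may help to record which versions of those packages ended up in the environment. The sketch below does that with the standard library only. The function name `installed_versions` is hypothetical. The assumption that the `torchvision::nms` error stems from a torch/torchvision version mismatch is a common explanation for this message, not something this PR confirms.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages that appear in the traceback above.
PACKAGES = ("vllm", "torch", "torchvision", "transformers")


def installed_versions(names=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name:12s} {ver or 'not installed'}")
```

Because it reads package metadata rather than importing the packages, this report still works in an environment where `import torchvision` itself fails.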

sky serve up examples/oci/serve-qwen-7b.yaml -n qwen-llm

Tested (run the relevant ones):