vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Feature]: lazy import for VLM #6187

Closed: zhyncs closed this issue 3 months ago

zhyncs commented 4 months ago

šŸš€ The feature, motivation and pitch

I used vLLM 0.5.0.post1 for Mixtral-8x7B-Instruct-v0.1 inference:

python3 -m vllm.entrypoints.openai.api_server --model /workdir/Mixtral-8x7B-Instruct-v0.1 --tensor-parallel-size 2

and got the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1560, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/image_processing_auto.py", line 27, in <module>
    from ...image_processing_utils import BaseImageProcessor, ImageProcessingMixin
  File "/usr/local/lib/python3.9/site-packages/transformers/image_processing_utils.py", line 21, in <module>
    from .image_transforms import center_crop, normalize, rescale
  File "/usr/local/lib/python3.9/site-packages/transformers/image_transforms.py", line 22, in <module>
    from .image_utils import (
  File "/usr/local/lib/python3.9/site-packages/transformers/image_utils.py", line 58, in <module>
    from torchvision.transforms import InterpolationMode
  File "/usr/local/lib/python3.9/site-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/usr/local/lib/python3.9/site-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/usr/local/lib/python3.9/site-packages/torch/library.py", line 467, in inner
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/usr/local/lib/python3.9/site-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.9/site-packages/vllm/entrypoints/openai/api_server.py", line 26, in <module>
    from vllm.entrypoints.openai.serving_chat import OpenAIServingChat
  File "/usr/local/lib/python3.9/site-packages/vllm/entrypoints/openai/serving_chat.py", line 29, in <module>
    from vllm.multimodal.image import ImagePixelData
  File "/usr/local/lib/python3.9/site-packages/vllm/multimodal/__init__.py", line 2, in <module>
    from .registry import MULTIMODAL_REGISTRY, MultiModalRegistry
  File "/usr/local/lib/python3.9/site-packages/vllm/multimodal/registry.py", line 9, in <module>
    from .image import (ImageFeatureData, ImageFeaturePlugin, ImagePixelData,
  File "/usr/local/lib/python3.9/site-packages/vllm/multimodal/image.py", line 9, in <module>
    from vllm.transformers_utils.image_processor import cached_get_image_processor
  File "/usr/local/lib/python3.9/site-packages/vllm/transformers_utils/image_processor.py", line 4, in <module>
    from transformers import AutoImageProcessor
  File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
  File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1551, in __getattr__
    value = getattr(module, name)
  File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1550, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1562, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.auto.image_processing_auto because of the following error (look up to see its traceback):
operator torchvision::nms does not exist

Since I only use the LLM functionality and not the VLM functionality, there shouldn't be a hard dependency on the VLM components. Maybe we could lazily import the VLM-related modules, similar to https://github.com/InternLM/lmdeploy/pull/1714. Do you have any suggestions? I'd be glad to land the PR for this. Thanks. cc @ywang96 @simon-mo
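
To make the idea concrete, here is a minimal sketch of the lazy-import pattern (illustrative only, not the actual vLLM code; the wrapper name and caching are my own assumptions). The heavy transformers/torchvision import moves from module scope into the function that needs it, so text-only serving never triggers it:

from functools import lru_cache

@lru_cache(maxsize=None)
def get_image_processor(processor_name: str, **kwargs):
    # Imported inside the function rather than at module top level, so
    # torchvision is only pulled in when an image processor is requested.
    from transformers import AutoImageProcessor
    return AutoImageProcessor.from_pretrained(processor_name, **kwargs)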

Alternatives

No response

Additional context

No response

DarkLight1337 commented 3 months ago

AutoImageProcessor is lazy-imported from v0.5.1 onwards, so this should no longer be a problem.

https://github.com/vllm-project/vllm/blob/main/vllm/transformers_utils/image_processor.py#L1-L15
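
A quick way to check this on v0.5.1+ (illustrative sketch; the expectation that torchvision stays out of sys.modules is an assumption based on the traceback above):

import sys

import vllm.transformers_utils.image_processor  # noqa: F401

# If the import is truly lazy, torchvision should not have been pulled in yet.
print("torchvision" in sys.modules)  # expected: False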

zhyncs commented 3 months ago

Hi @DarkLight1337, OK, I'll try the latest version.