opendatahub-io / vllm-tgis-adapter

vLLM adapter for a TGIS-compatible gRPC server.
Apache License 2.0

Setting num-gpus to 0 causes a division by 0 error #83

Open maxdebayser opened 3 months ago

maxdebayser commented 3 months ago

This is a very low-priority bug; I'm just taking notes here for the backlog.

If you try to run python -m vllm_tgis_adapter --num-gpus 0 --device cpu it will cause this error:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/mbayser/IBMProjects/FoundationModels/inference/vllm-tgis-adapter/build/__editable__.vllm_tgis_adapter-0.0.0-py3-none-any/vllm_tgis_adapter/__main__.py", line 59, in <module>
    engine = AsyncLLMEngine.from_engine_args(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mbayser/IBMProjects/FoundationModels/inference/vllm/build/__editable__.vllm-0.5.4+cpu-cp311-cp311-linux_x86_64/vllm/engine/async_llm_engine.py", line 462, in from_engine_args
    engine_config = engine_args.create_engine_config()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mbayser/IBMProjects/FoundationModels/inference/vllm/build/__editable__.vllm-0.5.4+cpu-cp311-cp311-linux_x86_64/vllm/engine/arg_utils.py", line 865, in create_engine_config
    return EngineConfig(
           ^^^^^^^^^^^^^
  File "<string>", line 15, in __init__
  File "/home/mbayser/IBMProjects/FoundationModels/inference/vllm/build/__editable__.vllm-0.5.4+cpu-cp311-cp311-linux_x86_64/vllm/config.py", line 1644, in __post_init__
    self.model_config.verify_with_parallel_config(self.parallel_config)
  File "/home/mbayser/IBMProjects/FoundationModels/inference/vllm/build/__editable__.vllm-0.5.4+cpu-cp311-cp311-linux_x86_64/vllm/config.py", line 273, in verify_with_parallel_config
    if total_num_attention_heads % tensor_parallel_size != 0:
       ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: integer modulo by zero

num-gpus should be ignored completely if device is cpu. But even if the device is gpu, there should be a validation error instead of a zero division error.
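
A minimal sketch of the validation described above. The helper name resolve_tensor_parallel_size is hypothetical and is not part of vllm-tgis-adapter or vLLM. It also assumes that a tensor-parallel size of 1 is an acceptable fallback on cpu, and that an unset num-gpus should fall back to 1 as well:

```python
from typing import Optional


def resolve_tensor_parallel_size(num_gpus: Optional[int], device: str) -> int:
    """Hypothetical helper: turn --num-gpus into a safe tensor-parallel size.

    Not the adapter's actual code; it only illustrates the two behaviours
    requested in this issue.
    """
    if device == "cpu":
        # The GPU count is meaningless on CPU, so it is ignored entirely.
        # Falling back to 1 is an assumption of this sketch.
        return 1
    if num_gpus is None:
        # Flag not given: assume a single device.
        return 1
    if num_gpus < 1:
        # Fail early with a clear message instead of letting a later
        # modulo operation raise ZeroDivisionError.
        raise ValueError(
            f"--num-gpus must be a positive integer, got {num_gpus}"
        )
    return num_gpus
```

With something like this applied before the engine config is built, --num-gpus 0 --device cpu would resolve to 1, and --num-gpus 0 on any other device would fail with a ValueError that names the offending flag.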

dtrifiro commented 3 months ago

This looks like an upstream issue tbh