vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: TypeError in benchmark_serving.py when using --model parameter #6069

Open Arthur-g-p opened 4 months ago

Arthur-g-p commented 4 months ago

Your current environment

The output of `python collect_env.py`

PyTorch version: 2.3.1+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=1910
DeviceID=CPU0
Family=198
L2CacheSize=1024
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2112
Name=Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] optree==0.11.0
[pip3] torch==2.3.1
[pip3] transformers==4.41.2
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: N/A
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect

🐛 Describe the bug

When running benchmark_serving.py with the --model parameter, I encountered a TypeError related to the snapshot_download() function.

Error message: TypeError: snapshot_download() got an unexpected keyword argument 'model_id'
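
For reference, the same TypeError reproduces with a direct call outside vLLM (a minimal sketch, assuming huggingface_hub is installed; "facebook/opt-125m" is only a placeholder repo id):

from huggingface_hub import snapshot_download

# Raises: TypeError: snapshot_download() got an unexpected keyword
# argument 'model_id' ("facebook/opt-125m" is just a placeholder).
snapshot_download(model_id="facebook/opt-125m")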

Root cause: The issue lies in the get_model() function defined in backend_request_func.py. This function calls snapshot_download() from either modelscope or huggingface_hub depending on the VLLM_USE_MODELSCOPE environment variable. Code Snippet:

import os

import huggingface_hub.constants  # for the HF_HUB_OFFLINE flag used below


def get_model(pretrained_model_name_or_path: str):
    if os.getenv('VLLM_USE_MODELSCOPE', 'False').lower() == 'true':
        from modelscope import snapshot_download
    else:
        from huggingface_hub import snapshot_download

    # model_id and ignore_file_pattern match modelscope's signature,
    # but not huggingface_hub's:
    model_path = snapshot_download(
        model_id=pretrained_model_name_or_path,
        local_files_only=huggingface_hub.constants.HF_HUB_OFFLINE,
        ignore_file_pattern=[".*.pt", ".*.safetensors", ".*.bin"])
    return model_path

Problem: The snapshot_download() function from huggingface_hub does not accept the model_id and ignore_file_pattern keyword arguments; those names belong to modelscope's API. According to the Hugging Face Hub documentation, model_id should be replaced with repo_id, and ignore_file_pattern with ignore_patterns (see the quick check below).
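
This is easy to confirm against the installed library (a quick check, nothing vLLM-specific):

import inspect
from huggingface_hub import snapshot_download

# The first parameter is repo_id, and file filtering goes through
# allow_patterns / ignore_patterns; there is no model_id or
# ignore_file_pattern in the signature.
print(inspect.signature(snapshot_download))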

In addition, even with the correct keyword name, the pattern ".*.safetensors" still lets the safetensors files be downloaded: ignore_patterns uses shell-style globs, in which "." is a literal dot, so the pattern only matches files whose names start with a dot. It should be updated to "*.safetensors" (and likewise for the .pt and .bin patterns), as illustrated below.
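
To illustrate, here is a quick check of the glob behavior, followed by a sketch of a corrected get_model() (untested, and assuming modelscope's snapshot_download keeps accepting model_id and ignore_file_pattern as before):

from fnmatch import fnmatch

# ignore_patterns is matched with fnmatch-style globs, where "." is a
# literal dot, so ".*.safetensors" only matches names starting with ".":
print(fnmatch("model.safetensors", ".*.safetensors"))  # False -> not ignored
print(fnmatch("model.safetensors", "*.safetensors"))   # True  -> ignored

A possible corrected version of the function:

import os
import huggingface_hub.constants


def get_model(pretrained_model_name_or_path: str):
    if os.getenv('VLLM_USE_MODELSCOPE', 'False').lower() == 'true':
        # modelscope's snapshot_download does accept model_id and
        # ignore_file_pattern, so this branch is unchanged.
        from modelscope import snapshot_download
        model_path = snapshot_download(
            model_id=pretrained_model_name_or_path,
            local_files_only=huggingface_hub.constants.HF_HUB_OFFLINE,
            ignore_file_pattern=[".*.pt", ".*.safetensors", ".*.bin"])
    else:
        # huggingface_hub expects repo_id and glob-style ignore_patterns.
        from huggingface_hub import snapshot_download
        model_path = snapshot_download(
            repo_id=pretrained_model_name_or_path,
            local_files_only=huggingface_hub.constants.HF_HUB_OFFLINE,
            ignore_patterns=["*.pt", "*.safetensors", "*.bin"])
    return model_path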

Let me know if I misunderstood something.

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!