runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

ValueError: rope_scaling must be a dictionary with two fields, type and factor #89


omar93939 commented 3 months ago
```
Traceback (most recent call last):
  File "/src/handler.py", line 6, in <module>
    vllm_engine = vLLMEngine()
  File "/src/engine.py", line 25, in __init__
    self.llm = self._initialize_llm() if engine is None else engine
  File "/src/engine.py", line 111, in _initialize_llm
    raise e
  File "/src/engine.py", line 105, in _initialize_llm
    engine = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(**self.config))
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 346, in from_engine_args
    engine_config = engine_args.create_engine_config()
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 520, in create_engine_config
    model_config = ModelConfig(
  File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 119, in __init__
    self.hf_config = get_config(self.model, trust_remote_code, revision,
  File "/usr/local/lib/python3.10/dist-packages/vllm/transformers_utils/config.py", line 38, in get_config
    raise e
  File "/usr/local/lib/python3.10/dist-packages/vllm/transformers_utils/config.py", line 23, in get_config
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 958, in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 768, in from_dict
    config = cls(**config_dict)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/configuration_llama.py", line 161, in __init__
    self._rope_scaling_validation()
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/configuration_llama.py", line 182, in _rope_scaling_validation
    raise ValueError(
ValueError: rope_scaling must be a dictionary with two fields, type and factor, got {'factor': 8.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}
```
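
For context: older transformers releases validate rope_scaling against the legacy two-field schema, while Llama 3.1's config.json ships the extended schema shown in the error. A rough side-by-side of the two shapes (the extended values are copied from the traceback above; the legacy example is only illustrative):

```python
# Legacy schema that older transformers releases expect (illustrative values):
legacy_rope_scaling = {"type": "linear", "factor": 8.0}

# Extended Llama 3.1 schema from the error above; only newer transformers
# releases accept the extra fields and the 'rope_type' key:
llama3_rope_scaling = {
    "rope_type": "llama3",
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
}
```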

As far as I know, this issue is fixed simply by upgrading Transformers (https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/discussions/15)
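
If it helps while the worker image catches up, here is a minimal startup guard (a sketch only; I'm assuming transformers >= 4.43.0 is the first release that accepts the new schema, based on the linked discussion):

```python
# Hypothetical startup check for the worker; the 4.43.0 threshold is an
# assumption from the Llama 3.1 release discussion, not something this repo pins.
from importlib.metadata import version
from packaging.version import Version

if Version(version("transformers")) < Version("4.43.0"):
    raise RuntimeError(
        "transformers is too old for Llama 3.1's rope_scaling schema; "
        "upgrade with: pip install -U 'transformers>=4.43.0'"
    )
```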

ashercn97 commented 3 months ago

Yes!

toomanydev commented 3 months ago

Yep, this will be necessary for testing Llama 3.1 (including the just-released 405B, which is exciting).