surak opened 9 months ago
This is the vLLM worker failing on Aya-101:
```
2024-02-15 16:14:20 | ERROR | stderr | Traceback (most recent call last):
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/fastchat/serve/vllm_worker.py", line 290, in <module>
2024-02-15 16:14:20 | ERROR | stderr | engine = AsyncLLMEngine.from_engine_args(engine_args)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 495, in from_engine_args
2024-02-15 16:14:20 | ERROR | stderr | engine = cls(parallel_config.worker_use_ray,
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 269, in __init__
2024-02-15 16:14:20 | ERROR | stderr | self.engine = self._init_engine(*args, **kwargs)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 314, in _init_engine
2024-02-15 16:14:20 | ERROR | stderr | return engine_class(*args, **kwargs)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 107, in __init__
2024-02-15 16:14:20 | ERROR | stderr | self._init_workers_ray(placement_group)
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 194, in _init_workers_ray
2024-02-15 16:14:20 | ERROR | stderr | self._run_workers(
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 750, in _run_workers
2024-02-15 16:14:20 | ERROR | stderr | self._run_workers_in_batch(workers, method, *args, **kwargs))
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 727, in _run_workers_in_batch
2024-02-15 16:14:20 | ERROR | stderr | all_outputs = ray.get(all_outputs)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
2024-02-15 16:14:20 | ERROR | stderr | return fn(*args, **kwargs)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
2024-02-15 16:14:20 | ERROR | stderr | return func(*args, **kwargs)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/ray/_private/worker.py", line 2624, in get
2024-02-15 16:14:20 | ERROR | stderr | raise value.as_instanceof_cause()
2024-02-15 16:14:20 | ERROR | stderr | ray.exceptions.RayTaskError(ValueError): ray::RayWorkerVllm.execute_method() (pid=1550269, ip=134.94.1.44, actor_id=4803c3335806647426226c9501000000, repr=<vllm.engine.ray_utils.RayWorkerVllm object at 0x7f2dd969ee50>)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/engine/ray_utils.py", line 32, in execute_method
2024-02-15 16:14:20 | ERROR | stderr | return executor(*args, **kwargs)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/worker/worker.py", line 72, in load_model
2024-02-15 16:14:20 | ERROR | stderr | self.model_runner.load_model()
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/worker/model_runner.py", line 36, in load_model
2024-02-15 16:14:20 | ERROR | stderr | self.model = get_model(self.model_config)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/model_executor/model_loader.py", line 88, in get_model
2024-02-15 16:14:20 | ERROR | stderr | model_class = _get_model_architecture(model_config.hf_config)
2024-02-15 16:14:20 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-15 16:14:20 | ERROR | stderr | File "/p/haicluster/llama/FastChat/sc_venv_2024/venv/lib/python3.11/site-packages/vllm/model_executor/model_loader.py", line 82, in _get_model_architecture
2024-02-15 16:14:20 | ERROR | stderr | raise ValueError(
2024-02-15 16:14:20 | ERROR | stderr | ValueError: Model architectures ['T5ForConditionalGeneration'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'FalconForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'LlamaForCausalLM', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'OPTForCausalLM', 'PhiForCausalLM', 'QWenLMHeadModel', 'RWForCausalLM', 'YiForCausalLM']
2024-02-15 16:14:21 | INFO | stdout | (RayWorkerVllm pid=1550268) MegaBlocks not found. Please install it by `pip install megablocks`. Note that MegaBlocks depends on mosaicml-turbo, which only supports Python 3.10 for now.
2024-02-15 16:14:21 | INFO | stdout | (RayWorkerVllm pid=1550268) STK not found: please see https://github.com/stanford-futuredata/stk
srun: error: haicluster1: task 0: Exited with exit code 1
```
The problem is that Aya-101 is built on a different architecture. Its model card states:

> Architecture: Same as mt5-xxl

i.e. it is an encoder-decoder model (`T5ForConditionalGeneration`), while most of the architectures vLLM currently supports are decoder-only `CausalLM` models.
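For reference, the mismatch can be confirmed directly from the Hugging Face config, which is the same `architectures` field the vLLM loader checks before raising the `ValueError` above. This is a minimal sketch, assuming `transformers` is installed and the hub ids `CohereForAI/aya-101` and `mistralai/Mistral-7B-v0.1` (substitute the local checkpoint path the worker actually loads):

```python
# Minimal sketch: inspect the `architectures` field that vLLM's model loader checks.
# The hub ids below are assumptions; replace them with local paths if needed.
from transformers import AutoConfig

aya = AutoConfig.from_pretrained("CohereForAI/aya-101")
print(aya.architectures)
# ['T5ForConditionalGeneration']  -> encoder-decoder (mT5 family), rejected by vLLM

mistral = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print(mistral.architectures)
# ['MistralForCausalLM']  -> decoder-only, in vLLM's supported list
```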