[Bug]: seq_group_metadata.encoder_seq_data.get_len() AttributeError: 'NoneType' object has no attribute 'get_len'

Your current environment

Due to network isolation, I am currently unable to run scripts on this machine. I am using 8x H100 80GB GPUs, and the command I run is: vllm serve /models/Llama-3.2-90B-Vision-Instruct/ --dtype auto --tensor_parallel_size 8 --max-num-seqs 32 --enforce-eager --gpu_memory_utilization 0.95 --max_model_len 8192 --max_seq_len_to_capture 8192 --speculative_model "[ngram]" --num_speculative_tokens 5 --ngram_prompt_lookup_max 4 --use_v2_block_manager

Model Input Dumps

No response

🐛 Describe the bug

(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks.
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229] Traceback (most recent call last):
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     output = executor(*args, **kwargs)
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/spec_decode/spec_decode_worker.py", line 361, in determine_num_available_blocks
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     self.scorer_worker.determine_num_available_blocks())
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     return func(*args, **kwargs)
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/worker/worker.py", line 223, in determine_num_available_blocks
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     self.model_runner.profile_run()
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     return func(*args, **kwargs)
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 1289, in profile_run
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     model_input = self.prepare_model_input(
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/spec_decode/target_model_runner.py", line 60, in prepare_model_input
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     model_input: ModelInputForGPUWithSamplingMetadata = super(
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 1586, in prepare_model_input
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     model_input = self._prepare_model_input_tensors(
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 1192, in _prepare_model_input_tensors
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     builder.add_seq_group(seq_group_metadata)
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]   File "/root/anaconda3/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 693, in add_seq_group
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229]     encoder_seq_len = seq_group_metadata.encoder_seq_data.get_len()
(VllmWorkerProcess pid=489) ERROR 10-31 08:09:05 multiproc_worker_utils.py:229] AttributeError: 'NoneType' object has no attribute 'get_len'

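To make the failure mode easier to discuss: the last frame of the traceback calls get_len() on encoder_seq_data, which is None during the profiling run. Below is a minimal sketch that reproduces the same error outside vLLM. The classes SequenceData and SeqGroupMetadata here are simplified, hypothetical stand-ins (not vLLM's actual definitions), and the guarded variant only illustrates a None check. I have not verified that it is the right fix for model_runner.py.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SequenceData:
    """Hypothetical stand-in: holds token ids and reports their count."""
    token_ids: List[int]

    def get_len(self) -> int:
        return len(self.token_ids)


@dataclass
class SeqGroupMetadata:
    """Hypothetical stand-in: encoder_seq_data may be absent (None)."""
    encoder_seq_data: Optional[SequenceData] = None


def encoder_len_unguarded(meta: SeqGroupMetadata) -> int:
    # Same access pattern as the failing frame in the traceback:
    # raises AttributeError when encoder_seq_data is None.
    return meta.encoder_seq_data.get_len()


def encoder_len_guarded(meta: SeqGroupMetadata) -> int:
    # Illustrative guard (an assumption, not a confirmed fix):
    # treat missing encoder data as an encoder length of 0.
    if meta.encoder_seq_data is None:
        return 0
    return meta.encoder_seq_data.get_len()


if __name__ == "__main__":
    meta = SeqGroupMetadata()  # no encoder data attached
    try:
        encoder_len_unguarded(meta)
    except AttributeError as exc:
        print(f"unguarded: {exc}")
    print(f"guarded: {encoder_len_guarded(meta)}")
```

Whether 0 is a sensible default for the real profiling path with this model and ngram speculative decoding is an open question; the sketch only shows where the None reaches the call.
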
Before submitting a new issue...

[X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
