
[Bug]: Error when running multimodal large models with --enable-prefix-caching #8296

Open · jiyanxin opened 3 weeks ago

jiyanxin commented 3 weeks ago


🐛 Describe the bug

When I launched the Qwen2-VL model server with vLLM, I set the `--enable-prefix-caching` flag. The server raised an error as soon as it received the second image request. It appears that this flag is currently incompatible with multimodal models. Are there plans to fix this incompatibility?

![screenshot](https://github.com/user-attachments/assets/0b61ae2d-5e32-4895-be4b-a84749d94646)
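For context, a minimal offline reproduction sketch of this setup might look like the following. The model name, prompt template, and image paths are illustrative assumptions and are not taken from the report; they only mirror the reported scenario of enabling prefix caching and sending two image requests in a row.

```python
# Minimal reproduction sketch (assumptions: model name, prompt template,
# and image paths are placeholders; adjust to your actual setup).
from PIL import Image
from vllm import LLM, SamplingParams

# Enable prefix caching in the engine, matching the server flag
# --enable-prefix-caching from the report.
llm = LLM(model="Qwen/Qwen2-VL-7B-Instruct", enable_prefix_caching=True)

sampling_params = SamplingParams(max_tokens=64)

# Qwen2-VL expects image placeholder tokens in the prompt; this template
# is an assumption, not taken from the issue.
prompt = (
    "<|im_start|>user\n"
    "<|vision_start|><|image_pad|><|vision_end|>Describe this image.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# The reported error appears on the second image request, so send two.
for path in ["image1.jpg", "image2.jpg"]:  # hypothetical files
    image = Image.open(path)
    outputs = llm.generate(
        {"prompt": prompt, "multi_modal_data": {"image": image}},
        sampling_params,
    )
    print(outputs[0].outputs[0].text)
```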


DarkLight1337 commented 3 weeks ago

Are you using the latest version of the PR branch for Qwen2-VL? This should have been fixed by the recent PR #8028.

Edit: Could you show how you're inputting the images?
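For readers following along, a common way to send images to vLLM's OpenAI-compatible server is via `image_url` content parts in a chat completion request. This is only a sketch of one possible input path, not necessarily how the reporter is sending images; the base URL, model name, and image URL below are placeholders.

```python
# Hedged example: one typical way to pass an image to the OpenAI-compatible
# server. The server URL, model name, and image URL are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2-VL-7B-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/demo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```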