vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Misc]: _run_workers_async function of DistributedGPUExecutorAsync #6400

Open HMJW opened 2 months ago

HMJW commented 2 months ago

I am confused about why the `_run_workers_async` function of `DistributedGPUExecutorAsync` was removed in v0.4.3.

The new implementation starts a loop for every worker, which prevents the workers from doing other things, such as transferring the KV cache in prefill/decode disaggregation. I previously used `_run_workers_async` to transfer the KV cache without any problems, but now the transfer only executes once the workers' loops have stopped.

I am sorry, but I am not familiar with asyncio in Python. I want to know what the benefits of the new implementation are, and how to allow the workers to transfer the KV cache asynchronously during generation.
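To illustrate the scheduling difference being described, here is a minimal asyncio sketch (not vLLM code; all names are illustrative): a worker that loops without awaiting keeps the event loop busy, so a concurrent coroutine such as a KV-cache transfer only runs after the loop ends, whereas per-call dispatch yields between steps and lets the transfer interleave.

```python
import asyncio

async def kv_transfer(log):
    # Hypothetical stand-in for an asynchronous KV-cache transfer task.
    log.append("kv_transfer")

async def tight_loop(log, steps):
    # Simplified "execution loop" pattern: no await inside the loop,
    # so no other coroutine can run until the loop finishes.
    for i in range(steps):
        log.append(f"step_{i}")

async def per_call_dispatch(log, steps):
    # Simplified per-call pattern: awaiting between steps yields
    # control to the event loop, letting other tasks interleave.
    for i in range(steps):
        log.append(f"step_{i}")
        await asyncio.sleep(0)  # yield to the event loop

async def run(worker):
    log = []
    task = asyncio.create_task(kv_transfer(log))
    await worker(log, 2)
    await task
    return log

print(asyncio.run(run(tight_loop)))         # ['step_0', 'step_1', 'kv_transfer']
print(asyncio.run(run(per_call_dispatch)))  # ['step_0', 'kv_transfer', 'step_1']
```

With the tight loop, the transfer is deferred until after both steps; with per-call dispatch, it runs between steps, which matches the behavior difference described above.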

youkaichao commented 2 months ago

This should be related to https://github.com/vllm-project/vllm/pull/4894 .