Closed: YaoJiayi closed this 2 weeks ago
This is likely a Python import problem.
I ran into this problem before and worked around it by running `pip install ./` (without `-e`) every time.
I found that the same module (`vllm.worker.model_runner`) is imported twice when installing with `-e`, which makes the monkey patching fail.
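A minimal sketch of why a double import can defeat a monkey patch. It does not touch vllm at all: `SOURCE`, `load_copy`, and the `ModelRunner` class below are hypothetical stand-ins, and the two module objects are built by hand to imitate one module being loaded twice. The assumption is that the patch lands on one copy while the running code uses the other.

```python
import types

# Hypothetical stand-in for the source of a module that gets loaded twice.
SOURCE = """
class ModelRunner:
    def execute(self):
        return "original"
"""


def load_copy(name):
    """Build an independent module object from the same source text."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


# Two module objects created from identical source. Each one defines its
# own, separate ModelRunner class.
copy_a = load_copy("model_runner_copy_a")
copy_b = load_copy("model_runner_copy_b")


def patched_execute(self):
    return "patched"


# The monkey patch is applied to the class in copy_a only.
copy_a.ModelRunner.execute = patched_execute

result_a = copy_a.ModelRunner().execute()
result_b = copy_b.ModelRunner().execute()

print("copy_a:", result_a)
print("copy_b:", result_b)
print("same class object:", copy_a.ModelRunner is copy_b.ModelRunner)
```

If the code that serves requests holds a reference to the second copy, the patch is silently ignored, which would match the symptom described here.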
In my three experiments it works well. Check whether you installed vllm using `pip install -e .`, as that may cause the problem. All my experiments were done using `pip install vllm`.
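One way to check how a package was installed is to look at where Python would load it from. The sketch below is only a heuristic, and the helper names `module_origin` and `in_site_packages` are made up for illustration: a regular `pip install` normally places the package under `site-packages` (or `dist-packages`), while an editable install usually resolves to the source checkout.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def in_site_packages(origin):
    """Heuristic: True if the path lies under site-packages or dist-packages."""
    if origin is None:
        return False
    parts = Path(origin).parts
    return any(part in ("site-packages", "dist-packages") for part in parts)


if __name__ == "__main__":
    origin = module_origin("vllm")
    if origin is None:
        print("vllm is not importable in this environment")
    elif in_site_packages(origin):
        print("vllm looks like a regular install:", origin)
    else:
        print("vllm may be an editable install:", origin)
```

Running `pip show vllm` in the same environment is another way to confirm the install location.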
Besides, could you activate lmcache_vllm using `lmcache_vllm serve` in your terminal?
Using the `vllm serve` command cannot activate lmcache.
Using `entrypoints` for now. Installing from PyPI works fine, but `pip install -e .` does not.
Created an issue at #12