Closed: Permafacture closed this issue 1 month ago
Update: I installed vLLM 0.5.4 locally through pip and did not have this issue. It's specific to the RunPod worker.
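For comparison, a minimal sketch of that local check (the model name is a placeholder; the server is assumed started with vLLM's OpenAI-compatible entrypoint):

```python
# Local check: pip install vllm==0.5.4, then start the server with
#   python -m vllm.entrypoints.openai.api_server --model <MODEL_NAME>
from openai import OpenAI

# vLLM's local server does not require a real key, so any placeholder works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Against a local pip install, the plain completions endpoint responds normally.
completion = client.completions.create(
    model="<MODEL_NAME>",
    prompt="San Francisco is a",
    max_tokens=7,
)
print(completion.choices[0].text)
```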
Seeing this too
Same issue in vLLM 0.5.3 (runpod/worker-v1-vllm:stable-cuda12.1.0)
@Permafacture thanks for reporting this problem.
We are working on resolving this; I will keep you updated!
Is the issue resolved?
Same issue with vLLM 0.5.4 on RunPod. Any news about this?
@ericflo @naaviii12345 @jamorell @Permafacture @Juhong-Namgung @prashantjoshi22
We just released a bug fix: runpod/worker-v1-vllm:v1.3.1dev-cuda12.1.0. Can you please check if this is working for you?
How do I try this on RunPod? I've only used the quick deploy for the vLLM worker.
This has been fixed.
Thanks team :-)
When trying to use the completions endpoint (rather than chat_completions) on a vLLM RunPod serverless instance, I get a server error. This happens with every model I've tried. The chat_completions endpoint works as expected.
This example from the vLLM quick start shows the issue.
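A minimal sketch of that call, assuming the OpenAI Python client and RunPod's OpenAI-compatible route (the endpoint ID, API key, and model name below are placeholders):

```python
from openai import OpenAI

# RunPod's OpenAI-compatible route for a serverless vLLM worker.
# <ENDPOINT_ID>, <RUNPOD_API_KEY>, and <MODEL_NAME> are placeholders.
client = OpenAI(
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

# chat.completions works against the same endpoint, but this call fails:
completion = client.completions.create(
    model="<MODEL_NAME>",
    prompt="San Francisco is a",
    max_tokens=7,
)
print(completion)
```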
On the client side I get a 500 error response. On the server I can see the error is:
'NoneType' object has no attribute 'headers'
This is using the most recent vLLM, 0.5.4.