codelion / optillm

Optimizing inference proxy for LLMs
Apache License 2.0
1.6k stars 128 forks

I get the following error: list index out of range #67

Open ErykCh opened 1 month ago

ErykCh commented 1 month ago

Hi,

In LibreChat, connected to the optillm proxy running in Docker, I get the following error:

2024-10-21 06:56:48 error: [handleAbortError] AI response error; aborting request: 500 "list index out of range"

prompt:

bon|moa|mcts|cot_reflection

There are two hippos in front of a some hippo, two hippos behind a some hippo and a some hippo in the middle. How many hippos are there?

The same error occurs with most other approaches.

It looks like optillm is returning messages that are not compliant with the OpenAI API standard.

ErykCh commented 1 month ago

And there is one more error in the logs:

2024-10-21 07:39:53 error: [OpenAIClient] Known OpenAI error: Error: missing role for choice 0
2024-10-21 07:44:46 warn: [OpenAIClient.chatCompletion][stream] API error

When this error occurs, LibreChat still displays a message, but it would also be good to fix it.

ErykCh commented 1 month ago

This error is connected with moa:

moa

There are two hippos in front of a some hippo, two hippos behind a some hippo and a some hippo in the middle. How many hippos are there?

and the error occurs with vLLM as the model inference backend:

2024-10-21 08:12:42,987 - INFO - Received request to /v1/chat/completions
2024-10-21 08:12:43,047 - INFO - Using approach(es) ['moa'], operation SINGLE, with model Qwen2.5
2024-10-21 08:12:54,009 - INFO - HTTP Request: POST http://localhost:8000/v1/chat/completions "HTTP/1.1 200 OK"
2024-10-21 08:12:54,011 - ERROR - Error processing request: list index out of range
2024-10-21 08:12:54,011 - INFO - 10.155.25.104 - - [21/Oct/2024 08:12:54] "POST /v1/chat/completions HTTP/1.1" 500 -
2024-10-21 08:12:54,517 - INFO - Received request to /v1/chat/completions
2024-10-21 08:12:54,568 - INFO - Using approach(es) ['moa'], operation SINGLE, with model Qwen2.5
2024-10-21 08:13:01,626 - INFO - HTTP Request: POST http://localhost:8000/v1/chat/completions "HTTP/1.1 200 OK"
2024-10-21 08:13:01,627 - ERROR - Error processing request: list index out of range
2024-10-21 08:13:01,627 - INFO - 10.155.25.104 - - [21/Oct/2024 08:13:01] "POST /v1/chat/completions HTTP/1.1" 500 -
2024-10-21 08:13:02,473 - INFO - Received request to /v1/chat/completions
2024-10-21 08:13:02,528 - INFO - Using approach(es) ['moa'], operation SINGLE, with model Qwen2.5
2024-10-21 08:13:10,743 - INFO - HTTP Request: POST http://localhost:8000/v1/chat/completions "HTTP/1.1 200 OK"
2024-10-21 08:13:10,745 - ERROR - Error processing request: list index out of range
2024-10-21 08:13:10,745 - INFO - 10.155.25.104 - - [21/Oct/2024 08:13:10] "POST /v1/chat/completions HTTP/1.1" 500 -

codelion commented 1 month ago

Does vLLM support returning multiple responses from the /v1/chat/completions endpoint? For moa we get 3 generations from the model (https://github.com/codelion/optillm/blob/95cc14d1505c76fe124f56b233a864a971760893/optillm/moa.py#L16). Can you try another technique like cot_reflection and see if you get the same error? Unfortunately, I cannot test vLLM locally as I am on a Mac M3 and vLLM doesn't support it (https://github.com/vllm-project/vllm/issues/2081).
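For context, here is a minimal sketch of the kind of request moa relies on, assuming the standard OpenAI Python client; the endpoint, API key, and model name are placeholders taken from the logs above, not optillm's actual code:

```python
from openai import OpenAI

# Placeholder client pointed at a vLLM OpenAI-compatible server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen2.5",
    messages=[{"role": "user", "content": "How many hippos are there?"}],
    n=3,  # moa needs three candidate generations from a single request
)

# If the backend honours n=3 there are three choices to work with; if it
# silently returns a single choice, indexing the second or third candidate
# raises "list index out of range".
candidates = [choice.message.content for choice in response.choices]
print(len(candidates))
```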

ErykCh commented 1 month ago

Yes, it does support that.

cot_reflection is OK. mcts is OK too (but I will create another ticket; it seems there is a problem with the mcts configuration).

codelion commented 1 month ago

Do you have the same problem (running moa) with another model? Or are you calling it with the right chat template?

Regarding "Error: missing role for choice 0": this error will appear if the response message doesn't have a role on it.
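For reference, a sketch of the response shape the client expects, using the standard OpenAI chat completion fields with placeholder values: every entry in choices must carry a message with a role (normally "assistant").

```python
# Minimal OpenAI-style chat completion response; values are placeholders.
expected_response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "Qwen2.5",
    "choices": [
        {
            "index": 0,
            # "missing role for choice 0" is raised when this role field is absent.
            "message": {"role": "assistant", "content": "..."},
            "finish_reason": "stop",
        }
    ],
}
```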

ErykCh commented 1 month ago

I was sending too many different approaches at once (bon|moa|mcts|cot_reflection), so let me sort it out:

The problem with vLLM and the error "Error processing request: list index out of range" (this error appears in the optillm logs) refers to moa.

The problem with what optillm returns to LibreChat, the error "Known OpenAI error: Error: missing role for choice 0" (this error appears in the LibreChat logs), refers to mcts, bon, and cot_reflection. I've tried z3 now and it also has this error. Luckily, the answer itself appears in the chat for all of them.

ErykCh commented 1 month ago

Known OpenAI error: Error: missing role for choice 0

Direct connection from LibreChat to vLLM doesn't cause such an error

codelion commented 1 month ago

Known OpenAI error: Error: missing role for choice 0

Direct connection from LibreChat to vLLM doesn't cause such an error

This particular error looks like a known issue with LibreChat - https://github.com/danny-avila/LibreChat/discussions/1222

ErykCh commented 1 month ago

Ok, so only 'list index out of range' left.

codelion commented 1 month ago

Ok, so only 'list index out of range' left.

For that, can you please run moa with vLLM with the changes I made here: https://github.com/codelion/optillm/commit/0df52917a44fd1c9f200e8bef082e3b465e64511

I added more logging to help figure out where it is failing.
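(Purely as an illustration of the idea, not the actual commit: a hypothetical helper that logs the choice count before indexing, so a backend that ignores n shows up in the debug output instead of only an IndexError.)

```python
import logging

logger = logging.getLogger(__name__)

def extract_completions(response, expected: int = 3) -> list[str]:
    # Hypothetical sketch: log how many choices came back before indexing them,
    # so a short list is visible in the logs rather than just the traceback.
    logger.debug("Backend returned %d choice(s), expected %d",
                 len(response.choices), expected)
    return [choice.message.content for choice in response.choices]
```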

ErykCh commented 1 month ago

I have --log debug

but I don't see all the debug logs (screenshot attached).

ErykCh commented 1 month ago

So this is a problem with vLLM: it returns 1 choice instead of 3, even though the n parameter is visible in the vLLM logs (screenshot attached).


codelion commented 1 month ago

Yeah, vLLM is not returning 3 responses. Can you get the 3 responses if you set n directly in your OpenAI client and call vLLM?
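One way to run that check, assuming the OpenAI Python client with the vLLM endpoint and model name from the logs above as placeholders:

```python
from openai import OpenAI

# Placeholders: vLLM's OpenAI-compatible endpoint and the model from the logs.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen2.5",
    messages=[{"role": "user", "content": "Say hello."}],
    n=3,
)

# 3 here means vLLM honours n; 1 reproduces what moa is seeing.
print(len(resp.choices))
for choice in resp.choices:
    print(choice.index, choice.message.role, choice.message.content[:60])
```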

codelion commented 2 weeks ago

We will implement a fallback for this in #83.
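One plausible shape for such a fallback, sketched here only to illustrate the idea (it is not necessarily what #83 implements): if the backend returns fewer choices than requested, top up with extra single-completion calls.

```python
from openai import OpenAI

def get_n_completions(client: OpenAI, model: str, messages: list, n: int) -> list[str]:
    # Hypothetical fallback sketch: ask for n completions in one request, and if
    # the backend ignores n (as vLLM appears to here), pad with single calls.
    resp = client.chat.completions.create(model=model, messages=messages, n=n)
    texts = [c.message.content for c in resp.choices]
    while len(texts) < n:
        extra = client.chat.completions.create(model=model, messages=messages, n=1)
        texts.append(extra.choices[0].message.content)
    return texts
```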