oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

OpenAI API reply TypeError #2852

Closed: Pb-207 closed this issue 1 year ago

Pb-207 commented 1 year ago

Describe the bug

An error is encountered when replying to other front ends (e.g. langflow): TypeError: must be str, not list. Something seems to be wrong with streaming replies.

Full log: see the Logs section below.

Is there an existing issue for this?

Reproduction

flags: --chat --gpu-memory 22 --model Vicuna-33B-GPTQ-4bit --trust-remote-code --loader exllama --extensions openai

The webui itself was working well.
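For reference, a minimal client-side call that triggers the error; this is a sketch assuming the openai 0.27.x Python bindings shown in the logs, with a placeholder model name that the extension ignores:

```python
import openai

# Point the 0.27.x bindings at the webui's OpenAI-compatible endpoint
# (base URL and dummy key taken from the logs below).
openai.api_base = "http://127.0.0.1:5001/v1"
openai.api_key = "sk-12345"

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # placeholder; the extension serves whatever model is loaded
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,  # the report suggests the error shows up with streaming enabled
)
for chunk in response:
    print(chunk)
```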

Screenshot

No response

Logs

2023-06-25 03:20:37 INFO:Loaded the model in 14.14 seconds.

2023-06-25 03:20:37 INFO:Loading the extension "openai"...
2023-06-25 03:20:38 INFO:Loading the extension "gallery"...
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

Loaded embedding model: all-mpnet-base-v2, max sequence length: 384
Starting OpenAI compatible api:
OPENAI_API_BASE=http://127.0.0.1:5001/v1
Output generated in 2.19 seconds (12.33 tokens/s, 27 tokens, context 43, seed 951727781)
127.0.0.1 - - [25/Jun/2023 03:21:11] "POST /v1/chat/completions HTTP/1.1" 200 -
Host: 127.0.0.1:5001
X-OpenAI-Client-User-Agent: {"bindings_version": "0.27.8", "httplib": "requests", "lang": "python", "lang_version": "3.10.8", "platform": "Windows-10-10.0.22621-SP0", "publisher": "openai", "uname": "Windows 10 10.0.22621 AMD64"}
User-Agent: OpenAI/v1 PythonBindings/0.27.8
Authorization: Bearer sk-12345
Content-Type: application/json
Accept: */*
Accept-Encoding: gzip, deflate
Content-Length: 413

{'messages': [{'role': 'user', 'content': 'The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n\nCurrent conversation:\n\nHuman: Hello\nAI:'}], 'model': 'gpt-3.5-turbo', 'max_tokens': None, 'stream': True, 'n': 1, 'temperature': 0.7}
Loaded instruction role format: Vicuna-v0
{'prompt': "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n### Human: The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n\nCurrent conversation:\n\nHuman: Hello\nAI:\n### Assistant:", 'req_params': {'max_new_tokens': 1024, 'temperature': 0.7, 'top_p': 1.0, 'top_k': 1, 'repetition_penalty': 1.18, 'encoder_repetition_penalty': 1.0, 'suffix': None, 'stream': True, 'echo': False, 'seed': -1, 'truncation_length': 2048, 'add_bos_token': True, 'do_sample': True, 'typical_p': 1.0, 'epsilon_cutoff': 0.0, 'eta_cutoff': 0.0, 'tfs': 1.0, 'top_a': 0.0, 'min_length': 0, 'no_repeat_ngram_size': 0, 'num_beams': 1, 'penalty_alpha': 0.0, 'length_penalty': 1.0, 'early_stopping': False, 'mirostat_mode': 0, 'mirostat_tau': 5.0, 'mirostat_eta': 0.1, 'ban_eos_token': False, 'skip_special_tokens': True, 'custom_stopping_strings': ['\n###', '\n### Human:', '### Human:']}}
Exception ignored in: <generator object generate_reply_custom at 0x000001D4CFDD4F90>
Traceback (most recent call last):
  File "D:\PyCharm Community Edition 2022.2.3\Projects\text-generation-webui\modules\text_generation.py", line 327, in generate_reply_custom
    print(f'Output generated in {(t1-t0):.2f} seconds ({new_tokens/(t1-t0):.2f} tokens/s, {new_tokens} tokens, context {original_tokens}, seed {seed})')
ZeroDivisionError: float division by zero
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 52167)
Traceback (most recent call last):
  File "D:\AnaConda\envs\NLP\lib\socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "D:\AnaConda\envs\NLP\lib\socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "D:\AnaConda\envs\NLP\lib\socketserver.py", line 747, in __init__
    self.handle()
  File "D:\AnaConda\envs\NLP\lib\http\server.py", line 432, in handle
    self.handle_one_request()
  File "D:\AnaConda\envs\NLP\lib\http\server.py", line 420, in handle_one_request
    method()
  File "D:\PyCharm Community Edition 2022.2.3\Projects\text-generation-webui\extensions\openai\script.py", line 472, in do_POST
    for a in generator:
  File "D:\PyCharm Community Edition 2022.2.3\Projects\text-generation-webui\modules\text_generation.py", line 23, in generate_reply
    for result in _generate_reply(*args, **kwargs):
  File "D:\PyCharm Community Edition 2022.2.3\Projects\text-generation-webui\modules\text_generation.py", line 210, in _generate_reply
    reply, stop_found = apply_stopping_strings(reply, all_stop_strings)
  File "D:\PyCharm Community Edition 2022.2.3\Projects\text-generation-webui\modules\text_generation.py", line 148, in apply_stopping_strings
    idx = reply.find(string)
TypeError: must be str, not list

System Info

OS: Windows 11
GPU: NVIDIA GeForce RTX 4090 24 GB × 1
Pb-207 commented 1 year ago

@matatonic

rdkilla commented 1 year ago

Same error after updating this morning.

matatonic commented 1 year ago

Did you update recently? This seems related to the new changes to stopping strings, perhaps.

Or perhaps something else; I can't tell where that floating-point division by zero is coming from.

This probably needs other eyes, but I don't think it's happening because of anything OpenAI-specific; it's happening in generate_reply.

@oobabooga

matatonic commented 1 year ago

Ok, hrm, I think I see what might be up: ast.literal_eval(f"[{state['custom_stopping_strings']}]")

Inside the state, that should just be a list, not a Python string, right? In the blocking_api I think it is a string, but in the code it should just be a list. It seems to me like that ast.literal_eval() should happen before the value ends up in the state, not after.

IMO, we should just remove all that ast eval stuff and let JSON handle the array of strings.
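To make the mismatch concrete, here is a minimal sketch (with hypothetical values) of what happens when a list rather than a string is wrapped in brackets and passed through ast.literal_eval:

```python
import ast

# Sketch of the suspected bug: when custom_stopping_strings arrives in
# state as an actual list (as the API path sends it) instead of the
# comma-separated string the UI uses, wrapping it in brackets and
# literal_eval-ing it yields a *nested* list.
state = {'custom_stopping_strings': ['\n###', '### Human:']}  # already a list

stops = ast.literal_eval(f"[{state['custom_stopping_strings']}]")
print(stops)  # [['\n###', '### Human:']] -- one element that is itself a list

# apply_stopping_strings then iterates and calls str.find on each element:
reply = "Hello!\n### Human: hi"
for string in stops:
    idx = reply.find(string)  # TypeError: must be str, not list
```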

rdkilla commented 1 year ago

I fixed the divide by zero with print(f'Output generated in {(t1-t0 + 1e-9):.2f} seconds ({new_tokens/(t1-t0 + 1e-9):.2f} tokens/s, {new_tokens} tokens, context {original_tokens}, seed {seed})')

Adding a small epsilon fixes that, but I still get the str/list error.
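An alternative to the epsilon workaround is to guard the rate calculation explicitly; the helper below is a hypothetical sketch, not the project's actual code:

```python
def log_generation(t0: float, t1: float, new_tokens: int,
                   original_tokens: int, seed: int) -> None:
    # Only compute tokens/s when a measurable amount of time has elapsed,
    # so an aborted generation (t1 == t0) cannot divide by zero.
    elapsed = t1 - t0
    rate = new_tokens / elapsed if elapsed > 0 else 0.0
    print(f'Output generated in {elapsed:.2f} seconds ({rate:.2f} tokens/s, '
          f'{new_tokens} tokens, context {original_tokens}, seed {seed})')

# An aborted request now logs a 0.00 tokens/s line instead of crashing:
log_generation(0.0, 0.0, 0, 43, 951727781)
```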

Pb-207 commented 1 year ago

> Did you update recently? This seems related to the new changes to stopping strings, perhaps.
>
> Or perhaps something else; I can't tell where that floating-point division by zero is coming from.
>
> This probably needs other eyes, but I don't think it's happening because of anything OpenAI-specific; it's happening in generate_reply.
>
> @oobabooga

Yes, I just updated to the latest commit and then hit this error. Thanks for your reply.

Pb-207 commented 1 year ago

> I fixed the divide by zero with print(f'Output generated in {(t1-t0 + 1e-9):.2f} seconds ({new_tokens/(t1-t0 + 1e-9):.2f} tokens/s, {new_tokens} tokens, context {original_tokens}, seed {seed})')
>
> Adding a small epsilon fixes that, but I still get the str/list error.

That doesn't seem to be the main problem: I changed "t1-t0" to 1 and the first error disappeared, but the TypeError still remains.

matatonic commented 1 year ago

That PR may fix the API (not tested yet, and it may still break other things).

matatonic commented 1 year ago

Hrm, maybe I need to remove something from the API after all; maybe I had been working around what was fixed today. Testing some changes now.

Pb-207 commented 1 year ago

> That PR may fix the API (not tested yet, and it may still break other things).

Astonishing speed. 👍🏻 Thanks a lot.

matatonic commented 1 year ago

This should fix it: I've just removed all references to custom_stopping_strings. I think I was using it incorrectly anyway and already had a partial fix in place.

https://github.com/oobabooga/text-generation-webui/pull/2849

Still not quite right ... working on it.
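For illustration, the kind of input normalization that would resolve the mismatch looks roughly like the sketch below; the helper name is hypothetical and this is not necessarily what the PR does:

```python
import ast

def normalize_stopping_strings(value):
    # Hypothetical helper: accept either the UI's comma-separated string
    # or an already-parsed list (as the API sends) and always return a
    # flat list of str for apply_stopping_strings to consume.
    if isinstance(value, str):
        value = ast.literal_eval(f"[{value}]") if value.strip() else []
    flat = []
    for item in value:
        if isinstance(item, list):
            flat.extend(str(s) for s in item)
        else:
            flat.append(str(item))
    return flat

print(normalize_stopping_strings("'\\n###', '### Human:'"))  # from the UI field
print(normalize_stopping_strings(['\n###', '### Human:']))   # from the API path
# Both print: ['\n###', '### Human:']
```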

matatonic commented 1 year ago

Ok, that should do it; I needed to call an exorcist. It's good now, I think. Please test the PR.