oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.

KatyTheCutie_EstopianMaid-13B problem with long character context #6159

Open · Butterfly-Dragon opened this issue 3 months ago

Butterfly-Dragon commented 3 months ago

Describe the bug

Apparently the character context is not being included in the chat context, as neither compress_pos_emb nor alpha_value seems to have any effect with this model. So while I have verified that the model produces high-quality results outside of long-context characters, those results are impossible to reproduce once the character context exceeds a few thousand tokens.

By contrast, pasting the same context (and past interactions) directly into a prompt produces the desired results.

The following is the result with either compress_pos_emb or alpha_value set to 8 to compensate for the model's native 4K context, using a character whose context is roughly 9K tokens.

The model works fine with Assistant or the baseline Example character.
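
For reference, the expectation behind that 8x setting works out as below. This is a back-of-the-envelope sketch, assuming compress_pos_emb acts as a linear RoPE scaling factor that multiplies the native window; the token counts come from this report.

```python
# Back-of-the-envelope check: with linear RoPE scaling (compress_pos_emb),
# the effective window should be the native window times the factor.
base_window = 4096       # native 4K context of the model, per this report
factor = 8               # the compress_pos_emb / alpha_value used above
character_tokens = 9000  # approximate size of the character context

effective_window = base_window * factor
print(f"Effective window: {effective_window} tokens")             # 32768
print(f"Character fits: {character_tokens <= effective_window}")  # True
```

So a ~9K-token character should fit with room to spare, which makes the unchanged 4096 truncation length in the log below the suspicious part.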

Is there an existing issue for this?

Reproduction

1. Use KatyTheCutie_EstopianMaid-13B as the model.
2. Have a chat of over 4K tokens, until the log says "truncated" (a token-counting sketch for this step follows the list).
3. Change compress_pos_emb or alpha_value to 8.
4. Use the past chat as context and try to keep talking.
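
To confirm when step 2 crosses the native window, the pasted context can be token-counted offline. A minimal sketch, assuming the Hub id KatyTheCutie/EstopianMaid-13B (inferred from the local folder name above) and a hypothetical character_context.txt holding the pasted chat and character context:

```python
# Count tokens in the pasted context with the model's own tokenizer.
from transformers import AutoTokenizer

# Hub id inferred from the local folder name; adjust if yours differs.
tokenizer = AutoTokenizer.from_pretrained("KatyTheCutie/EstopianMaid-13B")

with open("character_context.txt", encoding="utf-8") as f:  # hypothetical file
    text = f.read()

n_tokens = len(tokenizer(text).input_ids)
print(f"{n_tokens} tokens vs. the native 4096-token window")
```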

Screenshot

No response

Logs

05:09:26-490721 ERROR    Failed to build the chat prompt. The input is too long for the available context length.

                         Truncation length: 4096
                         max_new_tokens: 512 (is it too high?)
                         Available context length: 3584

Traceback (most recent call last):
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\queueing.py", line 566, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\blocks.py", line 1786, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\blocks.py", line 1350, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 583, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 576, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\anyio\_backends\_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 559, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 742, in gen_wrapper
    response = next(iterator)
               ^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 406, in generate_chat_reply_wrapper
    for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True, for_ui=True)):
  File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 374, in generate_chat_reply
    for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message, for_ui=for_ui):
  File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 318, in chatbot_wrapper
    prompt = generate_chat_prompt(text, state, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 223, in generate_chat_prompt
    raise ValueError
ValueError
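
The numbers in the log are self-consistent: the webui reserves max_new_tokens out of the truncation length before building the prompt, leaving 4096 - 512 = 3584 tokens for the input. Below is a minimal sketch of that check, reconstructed from the log alone; it is not the actual modules/chat.py code.

```python
# Sketch of the failing check, reconstructed from the log above;
# not the actual modules/chat.py implementation.
def check_prompt_fits(prompt_tokens: int,
                      truncation_length: int = 4096,
                      max_new_tokens: int = 512) -> None:
    # Space left for the prompt = total window minus reserved output tokens.
    available = truncation_length - max_new_tokens  # 4096 - 512 = 3584
    if prompt_tokens > available:
        raise ValueError(
            f"Prompt is {prompt_tokens} tokens, only {available} available"
        )

check_prompt_fits(prompt_tokens=9000)  # ~9K character context -> ValueError
```

Note that the truncation length still reads 4096 even with the 8x scaling applied; since the truncation length is a generation parameter set separately from the Model-tab RoPE options, one possibility (an observation from the log, not a confirmed diagnosis) is that it was never raised to match the scaled window.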

System Info

Microsoft Windows [Version 10.0.19045.4529]
i9-9900K (Coffee Lake)
64GB DDR4 @ 1200MHz
GeForce RTX 2080 8GB
Butterfly-Dragon commented 2 months ago

Any help?