Describe the bug
Apparently the character context is not included in the chat context, as neither compress_pos_emb nor alpha_value seems to have any effect with this model. So while I have verified that the model produces high-quality results outside of long-context characters, those results are impossible to reproduce once the context grows beyond a few thousand tokens.
Introducing the same context (and past interactions) as part of a prompt produces the desired results.
The log below is the result with either compress_pos_emb or alpha_value set to 8 to counter the model's native 4K context, using a character whose context is about 9K tokens.
The model works fine with Assistant or the baseline Example character.
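For reference, this is the arithmetic I expected (a rough sketch of my own assumption, not the webui's actual code; exact for compress_pos_emb's linear scaling, only approximate for alpha_value's NTK scaling):

native_context = 4096      # base window of the Llama-2 13B model underneath EstopianMaid
scaling_factor = 8         # the compress_pos_emb / alpha_value I set
expected_window = native_context * scaling_factor   # 32768
character_context = 9000   # approximate size of the character definition
print(character_context < expected_window)          # True -> should fit with room to spare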
Is there an existing issue for this?
[X] I have searched the existing issues
Reproduction
Use KatyTheCutie_EstopianMaid-13B as the model.
Have a chat of over 4K tokens (until it says truncated).
Change compress_pos_emb or alpha_value to 8 (rough command-line equivalent sketched below).
Use the past chat as context and try to keep talking.
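For completeness, roughly how the model gets loaded (flags as I understand them from the README; in my actual runs I changed the values from the Model tab in the UI instead):

python server.py --model KatyTheCutie_EstopianMaid-13B --compress_pos_emb 8
# or, trying NTK scaling instead of linear scaling:
python server.py --model KatyTheCutie_EstopianMaid-13B --alpha_value 8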
Screenshot
No response
Logs
05:09:26-490721 ERROR Failed to build the chat prompt. The input is too long for the available context length.
Truncation length: 4096
max_new_tokens: 512 (is it too high?)
Available context length: 3584
Traceback (most recent call last):
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\queueing.py", line 566, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\route_utils.py", line 261, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\blocks.py", line 1786, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\blocks.py", line 1350, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 583, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 576, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\anyio\_backends\_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 559, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 742, in gen_wrapper
response = next(iterator)
^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 406, in generate_chat_reply_wrapper
for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True, for_ui=True)):
File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 374, in generate_chat_reply
for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message, for_ui=for_ui):
File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 318, in chatbot_wrapper
prompt = generate_chat_prompt(text, state, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Documents\GitHub\text-generation-webui\modules\chat.py", line 223, in generate_chat_prompt
raise ValueError
ValueError
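If I read the numbers above correctly, the available space is the truncation length minus max_new_tokens, and it is still sized for the native 4K window despite the 8x scaling setting, so my roughly 9K-token character context cannot fit (my reading of the log, not the webui's actual code):

truncation_length = 4096   # apparently unchanged by compress_pos_emb / alpha_value
max_new_tokens = 512
available_context = truncation_length - max_new_tokens   # 3584, as reported in the log
character_context = 9000   # rough size of my character's context
print(character_context > available_context)             # True -> prompt build fails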
System Info
Microsoft Windows [Version 10.0.19045.4529]
i9-9900K (Coffee Lake)
64GB DDR4 @ 1200MHz
GeForce RTX 2080 8GB