wawawario2 / long_term_memory

A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion.
GNU Affero General Public License v3.0

First time using plugin: "missing 4 required positional arguments: 'name1', 'name2', 'context', and 'chat_prompt_size'" #19

Closed: practical-dreamer closed this issue 1 year ago

practical-dreamer commented 1 year ago

Really cool-looking extension. I'm trying to get it working for the first time and running into an error. I could use some assistance diagnosing it further.

I followed the instructions to the best of my ability.

  1. git clone https://github.com/wawawario2/long_term_memory extensions/long_term_memory
  2. pip install -r extensions/long_term_memory/requirements.txt
  3. python -m pytest -v extensions/long_term_memory/

Here are my test results:

(textgen3) user@user-System-Product-Name:~/Documents/ooba/3/text-generation-webui$ python -m pytest -v extensions/long_term_memory/
=========================== test session starts ============================
platform linux -- Python 3.10.9, pytest-7.2.2, pluggy-1.0.0 -- /home/user/anaconda3/envs/textgen3/bin/python
cachedir: .pytest_cache
rootdir: /home/user/Documents/ooba/3/text-generation-webui
plugins: anyio-3.6.2
collected 9 items                                                          

extensions/long_term_memory/core/_test/test_memory_database.py::test_typical_usage PASSED [ 11%]
extensions/long_term_memory/core/_test/test_memory_database.py::test_duplicate_messages PASSED [ 22%]
extensions/long_term_memory/core/_test/test_memory_database.py::test_inconsistent_state PASSED [ 33%]
extensions/long_term_memory/core/_test/test_memory_database.py::test_extended_usage PASSED [ 44%]
extensions/long_term_memory/core/_test/test_memory_database.py::test_reload_embeddings_from_disk PASSED [ 55%]
extensions/long_term_memory/core/_test/test_memory_database.py::test_destroy_fake_memories PASSED [ 66%]
extensions/long_term_memory/core/_test/test_memory_database.py::test_multi_fetch PASSED [ 77%]
extensions/long_term_memory/utils/_test/test_chat_parsing.py::test_clean_character_message PASSED [ 88%]
extensions/long_term_memory/utils/_test/test_timestamp_parsing.py::test_get_time_difference_message PASSED [100%]

============================= warnings summary =============================
extensions/long_term_memory/core/_test/test_memory_database.py::test_typical_usage
  /home/user/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
    warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)

extensions/long_term_memory/core/_test/test_memory_database.py::test_typical_usage
  /home/user/.local/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================== 9 passed, 2 warnings in 26.12s ======================

Here's my entire log

(textgen3) user@user-System-Product-Name:~/Documents/ooba/3/text-generation-webui$ python server.py --wbits 4 --gpu-memory 15 19 --listen --model llama-65b --verbose --chat --extension long_term_memory

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /home/user/anaconda3/envs/textgen3/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/user/.local/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
Loading settings from settings.json...
Loading llama-65b...
Found the following quantized model: models/llama-65b-4bit.safetensors
Loading model ...
/home/user/.local/lib/python3.10/site-packages/safetensors/torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(filename, framework="pt", device=device) as f:
/home/user/.local/lib/python3.10/site-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
/home/user/.local/lib/python3.10/site-packages/torch/storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage = cls(wrap_storage=untyped_storage)
Done.
Using the following device map for the 4-bit model: {'model.embed_tokens': 0, 'model.layers.0': 0, 'model.layers.1': 0, 'model.layers.2': 0, 'model.layers.3': 0, 'model.layers.4': 0, 'model.layers.5': 0, 'model.layers.6': 0, 'model.layers.7': 0, 'model.layers.8': 0, 'model.layers.9': 0, 'model.layers.10': 0, 'model.layers.11': 0, 'model.layers.12': 0, 'model.layers.13': 0, 'model.layers.14': 0, 'model.layers.15': 0, 'model.layers.16': 0, 'model.layers.17': 0, 'model.layers.18': 0, 'model.layers.19': 0, 'model.layers.20': 0, 'model.layers.21': 0, 'model.layers.22': 0, 'model.layers.23': 0, 'model.layers.24': 0, 'model.layers.25': 0, 'model.layers.26': 0, 'model.layers.27': 0, 'model.layers.28': 0, 'model.layers.29': 0, 'model.layers.30': 0, 'model.layers.31': 0, 'model.layers.32': 0, 'model.layers.33': 0, 'model.layers.34': 0, 'model.layers.35': 0, 'model.layers.36': 1, 'model.layers.37': 1, 'model.layers.38': 1, 'model.layers.39': 1, 'model.layers.40': 1, 'model.layers.41': 1, 'model.layers.42': 1, 'model.layers.43': 1, 'model.layers.44': 1, 'model.layers.45': 1, 'model.layers.46': 1, 'model.layers.47': 1, 'model.layers.48': 1, 'model.layers.49': 1, 'model.layers.50': 1, 'model.layers.51': 1, 'model.layers.52': 1, 'model.layers.53': 1, 'model.layers.54': 1, 'model.layers.55': 1, 'model.layers.56': 1, 'model.layers.57': 1, 'model.layers.58': 1, 'model.layers.59': 1, 'model.layers.60': 1, 'model.layers.61': 1, 'model.layers.62': 1, 'model.layers.63': 1, 'model.layers.64': 1, 'model.layers.65': 1, 'model.layers.66': 1, 'model.layers.67': 1, 'model.layers.68': 1, 'model.layers.69': 1, 'model.layers.70': 1, 'model.layers.71': 1, 'model.layers.72': 1, 'model.layers.73': 1, 'model.layers.74': 1, 'model.layers.75': 1, 'model.layers.76': 1, 'model.layers.77': 1, 'model.layers.78': 1, 'model.layers.79': 1, 'model.norm': 1, 'lm_head': 1}
Loaded the model in 11.82 seconds.
Loading the extension "long_term_memory"... 
-----------------------------------------
IMPORTANT LONG TERM MEMORY NOTES TO USER:
-----------------------------------------
Please remember that LTM-stored memories will only be visible to the bot during your NEXT session. This prevents the loaded memory from being flooded with messages from the current conversation which would defeat the original purpose of this module. This can be overridden by pressing 'Force reload memories'
----------
LTM CONFIG
----------
change these values in ltm_config.json
{'ltm_context': {'injection_location': 'BEFORE_NORMAL_CONTEXT',
                 'memory_context_template': "{name2}'s memory log:\n"
                                            '{all_memories}\n'
                                            'During conversations between '
                                            '{name1} and {name2}, {name2} will '
                                            'try to remember the memory '
                                            'described above and naturally '
                                            'integrate it with the '
                                            'conversation.',
                 'memory_template': '{time_difference}, {memory_name} said:\n'
                                    '"{memory_message}"'},
 'ltm_reads': {'max_cosine_distance': 0.6,
               'memory_length_cutoff_in_chars': 1000,
               'num_memories_to_fetch': 2},
 'ltm_writes': {'min_message_length': 100}}
----------
-----------------------------------------
Ok.
Loading the extension "gallery"... Ok.
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/textgen3/lib/python3.10/site-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/user/anaconda3/envs/textgen3/lib/python3.10/site-packages/gradio/blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "/home/user/anaconda3/envs/textgen3/lib/python3.10/site-packages/gradio/blocks.py", line 929, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/user/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/user/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/user/anaconda3/envs/textgen3/lib/python3.10/site-packages/gradio/utils.py", line 490, in async_iteration
    return next(iterator)
  File "/home/user/Documents/ooba/3/text-generation-webui/modules/chat.py", line 218, in cai_chatbot_wrapper
    for history in chatbot_wrapper(text, state):
  File "/home/user/Documents/ooba/3/text-generation-webui/modules/chat.py", line 146, in chatbot_wrapper
    prompt = custom_generate_chat_prompt(text, state, **kwargs)
TypeError: custom_generate_chat_prompt() missing 4 required positional arguments: 'name1', 'name2', 'context', and 'chat_prompt_size'
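
For what it's worth, the traceback seems to narrow this down: modules/chat.py now invokes the hook with only two positional arguments (the text plus a single state dict), while the four names in the TypeError suggest the extension still declares the older per-argument signature. A rough sketch of the mismatch, reconstructed from the error message rather than from the extension's actual code (the second parameter's name is a guess; only the argument count and the four reported names come from the traceback):

# How the webui now calls the hook (modules/chat.py, line 146 above):
#     prompt = custom_generate_chat_prompt(text, state, **kwargs)
#
# An old-style hook along these lines binds only its first two parameters
# from that call, leaving exactly the four reported names unbound:
def custom_generate_chat_prompt(user_input, max_new_tokens, name1, name2,
                                context, chat_prompt_size, impersonate=False):
    ...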
practical-dreamer commented 1 year ago

I found the solution here: https://github.com/oobabooga/text-generation-webui/pull/1055

Edit: Just verified that ooba's patch works. I won't close this issue, as the extension-side fix is still outstanding, but it's pretty conclusive what the problem was...
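
For anyone who lands here before updating: the gist of the change is that the webui now packs the per-chat values into a single state dict, so the hook reads them from there instead of receiving them positionally. A minimal sketch of the new-style hook, with the state keys assumed to match the old argument names (see the PR above for the actual patch):

def custom_generate_chat_prompt(user_input, state, **kwargs):
    # Values that used to arrive as separate positional arguments are
    # now looked up in the state dict (key names assumed here):
    name1 = state['name1']
    name2 = state['name2']
    context = state['context']
    chat_prompt_size = state['chat_prompt_size']
    # ...build and return the chat prompt as before...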

digiwombat commented 1 year ago

Here's a gist with the script.py patch applied, for easy copy-pasting by anyone who wants it while waiting:

https://gist.github.com/digiwombat/3a22427b822114091cb4895eb92489d8

Tested and working with the latest pulls of both LTM and the webui as of this writing.

wawawario2 commented 1 year ago

Thanks for the report, all. This should now be fixed.

digiwombat commented 1 year ago

Pulled. Working for me with the latest webui commit. 👍🏻