LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

{Feature Request}: Summarization of previous chat single or multiple AI bots. #1132

Open SPn00b opened 1 week ago

SPn00b commented 1 week ago

Describe the Feature
Summarize previous chats to free up context space while still remembering important past events and actions, leaving the most recent 2-3 responses as-is. Examples: https://github.com/kwaroran/RisuAI/wiki/SupaMemory and https://perchance.org/ai-chat#edit
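The requested behavior can be sketched roughly as follows. This is illustrative pseudologic, not KoboldCpp code; `summarize` stands in for a hypothetical call back into the model, and the function and parameter names are invented for this sketch.

```python
# Sketch of SupaMemory-style compaction: collapse all but the last
# `keep_recent` turns into a single summary turn, keeping the most
# recent responses verbatim.

def compact_history(turns, keep_recent=3,
                    summarize=lambda text: "[summary] " + text[:40]):
    """Replace older turns with one summary turn; keep the rest as-is."""
    if len(turns) <= keep_recent:
        return list(turns)  # nothing old enough to compact
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize("\n".join(old))  # hypothetical model call
    return [summary] + recent

turns = [f"turn {i}" for i in range(6)]
compacted = compact_history(turns)
# -> one summary turn followed by "turn 3", "turn 4", "turn 5"
```

In a real implementation `summarize` would be a generation request against the loaded model with a summarization prompt.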

Additional Information:
- Edition: Windows 11 Pro, Version 23H2, OS build 22631.4169, Windows Feature Experience Pack 1000.22700.1034.0
- Processor: 12th Gen Intel(R) Core(TM) i3-12100F, 3.30 GHz
- Installed RAM: 16.0 GB (15.9 GB usable)
- System type: 64-bit operating system, x64-based processor
- Pen and touch: No pen or touch input is available for this display
- GPU: GTX 1050 Ti 4GB
- KoboldCpp Version: koboldcpp-1.75.2 on CUDA 12.1 using cuBLAS


LostRuins commented 1 week ago

Have you tried Autogenerate Memory? [screenshot]

SPn00b commented 1 week ago

Okay. Is there any way to auto-trigger it when the context limit is reached, so that it fits one output length (a single response) before context shifting removes previous tokens? Currently I'm using context shifting.
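The auto-trigger condition being asked for could be checked with something like the sketch below. This is an assumption about how such a trigger might work, not KoboldCpp behavior; the function name and parameters are invented for illustration.

```python
# Sketch: fire summarization once the prompt would no longer leave
# room for one full response, i.e. just before context shifting
# would start dropping earlier tokens.

def should_summarize(prompt_tokens, max_context, max_output):
    """True when the next generation would overflow the context window."""
    return prompt_tokens + max_output > max_context

# e.g. a 4096-token context with 512 tokens reserved for the output:
should_summarize(3700, 4096, 512)  # True  -> summarize older turns now
should_summarize(3000, 4096, 512)  # False -> still enough room
```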

LostRuins commented 1 week ago

It needs to be done manually.