
Local LLM-powered external information retrieval with text-generation-webui
GNU Affero General Public License v3.0

Ceruleus

Let chat agents set their own goals and take initiative in conversation.

Description

Ceruleus provides a back-end 'internal monologue' for chat agents by implementing flexible loops of goal setting, question formulation, information retrieval and appraisal, with natural points for optional user intervention. Information retrieval and summarization are implemented using Squire, while the primary user interaction occurs via oobabooga's text-generation-webui. The API of the latter is also used for high-level decision-making by the agent.

Moving beyond the traditional prompt and response interaction model, chat agents running Ceruleus take initiative in conversation, looking up information online and generating relevant messages autonomously on the basis of this information. Ceruleus makes this possible by prompting the language model for optional speech after the completion of an information analysis loop. In addition, information gathered by an agent in the course of its internal process - as well as the history of its successes and failures - is available to it via the 'internal' entry of the text-generation-webui chat log.

Additional data buffers for answers, goals and thoughts are used for internal processing by the agent, representing a 'subconscious' layer of memory that is not directly accessible by the LLM responsible for generating speech. The speech itself is produced in conjunction with goal-setting and question-asking in order to ensure good semantic coupling between the agent's speech and intent. However, portions of the Ceruleus loop also give the chat agent an opportunity to set goals and pose new questions intrinsically, without access to the conversation log.
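The loop described above can be outlined roughly as follows. This is an illustrative sketch only: every name, buffer and step here is a hypothetical stand-in, since in Ceruleus each step is backed by an LLM call or a Squire run rather than the stub logic shown.

```python
# Illustrative outline of a Ceruleus-style monologue loop. All names are
# hypothetical stand-ins; in Ceruleus each step is an LLM call or a Squire run.

def set_goal(last_message: str) -> str:
    """Derive a goal from conversation context (an LLM call in practice)."""
    return f"learn more about: {last_message}"

def formulate_question(goal: str) -> str:
    """Turn a goal into a searchable question (an LLM call in practice)."""
    return f"What is known about {goal.removeprefix('learn more about: ')}?"

def retrieve_answer(question: str) -> str:
    """Fetch and summarize information online (delegated to Squire in practice)."""
    return f"stub answer to: {question}"

def appraise(answer: str) -> bool:
    """Judge whether the answer is useful (an LLM call in practice)."""
    return bool(answer.strip())

def monologue_step(chat_log, goals, answers):
    """One pass of the goal -> question -> retrieval -> appraisal loop."""
    goal = set_goal(chat_log[-1])
    question = formulate_question(goal)
    answer = retrieve_answer(question)
    if appraise(answer):
        goals.append(goal)      # 'subconscious' buffers, hidden from the
        answers.append(answer)  # speech-generating LLM
        return answer           # may optionally be voiced as speech
    return None                 # failed appraisal: reformulate or retry

goals, answers = [], []
speech = monologue_step(["Tell me about coral reefs"], goals, answers)
```

The point of the sketch is the separation of concerns: the goal and answer buffers accumulate out of band, and speech is an optional by-product of a successful pass rather than a direct reply to the user.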

Given the limitations of openly available LLMs and the amount of additional processing performed by the agent in conjunction with the generation of speech, autonomous speech output is carefully groomed within Ceruleus by the sequential application of regular expressions. While the modular code structure of Ceruleus allows arbitrary LLM-powered steps to be easily introduced in order to further improve output quality, adjusting the behavior of a chat agent running Ceruleus requires little to no knowledge of coding.
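Sequential regular-expression grooming of this kind can be pictured as below. The patterns are purely illustrative examples, not the expressions Ceruleus actually ships with.

```python
import re

# Hypothetical grooming passes, applied in order, in the spirit of how
# Ceruleus cleans autonomous speech output. The patterns are illustrative.
GROOMING_PASSES = [
    (re.compile(r"^\s*As an AI[^.]*\.\s*", re.I), ""),  # strip a leading disclaimer
    (re.compile(r"\n{3,}"), "\n\n"),                    # collapse runs of blank lines
    (re.compile(r"[ \t]+$", re.M), ""),                 # trim trailing whitespace
]

def groom(text: str) -> str:
    """Apply each regular expression in sequence, then strip outer whitespace."""
    for pattern, replacement in GROOMING_PASSES:
        text = pattern.sub(replacement, text)
    return text.strip()
```

Because the passes run in a fixed order on plain strings, adjusting agent output is a matter of editing this list rather than touching any LLM-facing code.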

The Ceruleus back-end can be monitored live and interrogated using a GUI. This graphical interface includes a suite of tools providing insight into the 'thought process' of the agent, allows for process monitoring and control, and supports the export of data for offline analysis. The Ceruleus GUI also offers access to all the prompt templates used by the back-end for LLM-powered steps, enabling real-time prompt engineering.

Design Notes

Software Requirements

Installation

1. Clone this repository and navigate into its directory, e.g.:

git clone https://github.com/dibrale/ceruleus.git
cd ceruleus

2. Install dependencies, e.g. using

pip install -r requirements.txt

3. Clone Squire, e.g. using

git clone https://github.com/dibrale/squire.git

4. Install Squire dependencies, e.g. using

pip install -r squire/requirements.txt

5. Back up your character chat log, e.g.

mkdir backups
cp text-generation-webui/logs/<character_name>_persistent.json backups/<character_name>_persistent.json.bak

Ceruleus edits the chat log directly when it runs.


Note: The GUI starts immediately after the server using the current start.sh script, so parameter changes made via the GUI will not be reflected until the next run.


6. Open params.json in the root directory of Ceruleus and edit the parameters to suit your machine. These are detailed below.

7. Start Ceruleus using the provided script, e.g.

   ./start.sh

This will start the Ceruleus server in the background with output directed to logfile.log, then start the GUI in the foreground. To stop the back-end, note the process ID printed in the output and run kill -9 on that process.
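The background-process handling can be illustrated with a harmless stand-in; here sleep plays the role of the server process that start.sh launches (the actual server command differs):

```shell
# Launch a long-running command in the background with logging, mirroring how
# start.sh runs the Ceruleus server, and record its process ID.
sleep 60 > logfile.log 2>&1 &
SERVER_PID=$!
echo "back-end PID: $SERVER_PID"

# Later, stop the background process. With start.sh, use the PID noted in the
# script's output instead of $SERVER_PID.
kill -9 "$SERVER_PID"
```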

Parameters and Usage

Edit the params.json file before running Ceruleus to reflect your setup, with particular attention to CUDA_VISIBLE_DEVICES, char_card_path, char_log_path and squire_model_path. The full list of parameters is described below.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| script_name | String | ceruleus | The name of the script as it appears on every line of terminal output. |
| host | String | localhost | Preferred name of the local host. This may need to be changed to '127.0.0.1' on some machines, or to other addresses in the case of exotic setups. |
| port | Integer | 1230 | Port on which to run the server API. |
| verbose | Boolean | false | Set to enable additional terminal output for debugging. |
| CUDA_VISIBLE_DEVICES | String | 0 | Comma-separated list of all CUDA devices that should be visible to Ceruleus (e.g. use '0,1' if you have two GPUs and want both to be detectable). Passed as the shell variable of the same name when external scripts are executed. |
| results_dir | String | results | Path to the results directory. |
| work_dir | String | results | Path to the work directory. |
| template_dir | String | templates | Path to the templates directory. |
| char_card_path | String | char.json | Path to the character file Ceruleus is to use, e.g. text_generation_webui/characters/<character_name>.json |
| char_log_path | String | char_persistent.json | Path to the conversation log file Ceruleus is to use, e.g. text_generation_webui/logs/<character_name>_persistent.json |
| squire_path | String | squire | Path to the directory where squire.py is located. |
| squire_out_dir | String | squire_output | Path to the directory where Squire will write its output. This directory is monitored for text file activity, and any text file altered within it will be processed as an answer string. |
| model_path | String | ggml-model-q5_1.bin | Path to the model weights to be used when running Squire. Only CPU inference with llama.cpp is supported at this time, so this should be a *.bin file. |
| telesend | Boolean | false | Set to write goals in data_visible of the persistent conversation log instead of just in data. |
| retry_delay | Integer | 10 | Retry delay for web UI API calls, in seconds. |
| ping_interval | Integer | 15 | Ping interval for websockets, in seconds. |
| ping_timeout | Integer | 60 | Ping timeout for websockets, in seconds. |
| answer_attempts_max | Integer | 2 | Maximum number of times to run Squire on the same question before reappraisal. |
| internal_str | String | internal | Key of the JSON entry for invisible data in the conversation log. |
| visible_str | String | visible | Key of the JSON entry for visible data in the conversation log. |
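For orientation, a params.json built from the defaults above might look like the following abridged sketch. The paths and the character name 'Example' are placeholders; consult the shipped params.json for the exact keys and nesting.

```json
{
  "script_name": "ceruleus",
  "host": "localhost",
  "port": 1230,
  "verbose": false,
  "CUDA_VISIBLE_DEVICES": "0",
  "results_dir": "results",
  "work_dir": "results",
  "template_dir": "templates",
  "char_card_path": "text-generation-webui/characters/Example.json",
  "char_log_path": "text-generation-webui/logs/Example_persistent.json",
  "squire_path": "squire",
  "squire_out_dir": "squire_output",
  "model_path": "models/ggml-model-q5_1.bin",
  "telesend": false,
  "retry_delay": 10,
  "ping_interval": 15,
  "ping_timeout": 60,
  "answer_attempts_max": 2,
  "internal_str": "internal",
  "visible_str": "visible"
}
```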

Included in this parameter file is another object containing parameters for the LLM used by the appraisal code. These are described in detail elsewhere.

| Parameter | Type | Default |
| --- | --- | --- |
| n_ctx | Integer | 1800 |
| top_p | Float | 0.8 |
| top_k | Integer | 30 |
| repeat_penalty | Float | 1.1 |
| temperature | Float | 0.4 |
| n_batch | Integer | 700 |
| n_threads | Integer | 8 |
| n_gpu_layers | Integer | 10 |
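Rendered as JSON with the defaults above, this nested object would look roughly like the following; the key under which it sits in params.json is not shown here, so check the shipped file for the exact placement.

```json
{
  "n_ctx": 1800,
  "top_p": 0.8,
  "top_k": 30,
  "repeat_penalty": 1.1,
  "temperature": 0.4,
  "n_batch": 700,
  "n_threads": 8,
  "n_gpu_layers": 10
}
```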

Additional Llama parameters that can be passed by LangChain to llama.cpp can be included in this object. The final object in the parameter file contains all the parameters that can be passed to text-generation-webui. See the repository of that project for details regarding these.

Operation


Note: A number of convenience features are absent from the open version of Ceruleus at the time of this writing. Contact ADMC Science and Consulting via email if you are interested in the priority implementation of features that suit your needs.


Precautions

GUI Instructions

The Ceruleus GUI was written using PySimpleGUI. It opens with start.sh, but can be opened on its own with python ceruleus_client.py. This can be useful for parameter editing before starting the software. Once open, the GUI will display a status bar and five tabs: Parameters, Controls, Status, Log and Results.


Parameters Tab

From the parameters tab, you can view, modify, delete and save parameters for the back-end.

params.png

Controls Tab

The controls tab allows you to connect to a Ceruleus instance, pause and unpause the Ceruleus loop, and trigger the script.

controls.png


Status Tab

The status tab shows a record of subtask execution with respect to time and allows this data to be exported. So long as data recording is enabled, subtask data will update regardless of whether the tab is active.

status.png


Log Tab

The log tab allows for monitoring or analysis of a Ceruleus logfile, with basic filtering if desired.

log.png


Results Tab

The results tab allows for the viewing and modification of template, result and work files.

results.png

No-GUI Instructions

Afterword

I hope that this tool enhances your LLM-powered chat agents and that you find it both useful and simple to use. Please do not hesitate to contact me via this repository if you have any questions or encounter any issues with the software. ADMC Science and Consulting would be happy to further tailor Ceruleus to your needs and provide priority support. Contact us via email for details!