anon998 / simple-proxy-for-tavern

GNU Affero General Public License v3.0

Documentation for using SuperBig? #6

Open Charuru opened 1 year ago

Charuru commented 1 year ago

Some instructions on setup and installation, please? I already have this proxy working for Kobold+SillyTavern.

Thanks.

anon998 commented 1 year ago

It's not really in a working state right now. But here are more or less the instructions, at least for Linux:

Create a virtual environment, either with venv:

python -m venv venv
source ./venv/bin/activate

or with Conda:

conda create -n superbig
conda activate superbig

Grab the requirements file from here and install the dependencies: https://github.com/kaiokendev/superbig/blob/master/requirements.txt

pip install -r requirements.txt
pip install superbig

Run the script and enable it in the config:

python src/basic-superbig-api.py

Charuru commented 1 year ago

Thanks. If I do this, what should I expect, given that it's not in a working state? Should I even bother, or should I wait for updates?

anon998 commented 1 year ago

I think you should wait unless you want to tinker with the code yourself. Right now, if the prompt is bigger than the maximum context size, the proxy sends the chatlog to the SuperBig API along with your last message. The API returns a list of fragments sorted by relevance, which are then inserted into the prompt a couple of messages from the bottom. You need to explicitly use similar words for a fragment to match, and even then, the model might confuse events from the retrieved messages with current events. The returned fragments are truncated too, so it will miss the end of long messages.
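The behavior described above can be sketched roughly like this. Note this is a toy illustration, not SuperBig's actual code: the scoring here is naive word overlap, and all names are made up. It shows why you need to use similar words in your last message, and why truncation cuts off the ends of long messages:

```python
import string

# Toy sketch of relevance-based retrieval (NOT SuperBig's real code):
# score each chatlog fragment by word overlap with the query, sort by
# relevance, keep the top hits, and truncate them.

def words(text: str) -> set[str]:
    # lowercase and strip punctuation so "gold." matches "gold"
    return {w.strip(string.punctuation) for w in text.lower().split()}

def retrieve(query: str, chatlog: list[str],
             top_k: int = 2, max_len: int = 40) -> list[str]:
    ranked = sorted(chatlog, key=lambda m: len(words(query) & words(m)),
                    reverse=True)
    # truncation is why long messages lose their endings
    return [m[:max_len] for m in ranked[:top_k]]

chatlog = [
    "We visited the dragon's cave last week and found gold.",
    "The weather was nice today.",
    "The dragon guarded its gold fiercely inside the cave.",
]
for frag in retrieve("tell me about the dragon and the gold", chatlog):
    print(frag)
```

With this scorer, the two dragon messages win because they share words with the query; a paraphrase with different vocabulary would score zero, which mirrors the "explicitly use similar words" limitation.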

Charuru commented 1 year ago

Thanks again. I am up for some amount of tinkering, but I wouldn't want to duplicate anything the SuperBig dev is already going to do.

Do you think it'll work as intended if these issues are resolved? For example, it should be possible to hack it to return whole sentences instead of fragments. It should be possible to mark the origin of those fragments.

For my use case, I want to write a very long background description, say 3,000 tokens' worth of context. I would want any returned fragment marked as "this is from the background description" vs. "this is from earlier in the chatlog". Actually, if it can support longer descriptions, it would already be worth using even if it doesn't work well for the chatlog.

The similar words might be a big issue though.
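The origin-marking idea could be as simple as prefixing each fragment with a source label before it goes into the prompt. A hypothetical helper (none of these names come from SuperBig):

```python
# Hypothetical sketch: tag each retrieved fragment with its source so
# the model can tell the background description apart from earlier
# chatlog messages. Not part of SuperBig's actual API.

def tag_fragments(fragments: list[str], source_label: str) -> list[str]:
    return [f"[{source_label}] {frag}" for frag in fragments]

background_hits = ["The kingdom lies beyond the northern mountains."]
chatlog_hits = ["Yesterday we crossed the mountain pass."]

prompt_extras = (
    tag_fragments(background_hits, "from the background description")
    + tag_fragments(chatlog_hits, "from earlier in the chatlog")
)
for line in prompt_extras:
    print(line)
```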

anon998 commented 1 year ago

I haven't finished reading the source code, so right now I don't know how to return the whole message instead of a fragment, or how to control the amount of information returned. At the moment, it only supports one source at a time, but you could just send the character description in a separate call. In the issues of the superbig repo, the dev talks a bit about what he's planning to do.
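The separate-call workaround would look something like this. The retrieval function here is a stand-in (naive substring matching), not the real superbig API, and all names are invented for illustration:

```python
# Hypothetical sketch: since the API only handles one source per call,
# query each source separately and merge the labeled results. The
# retrieval itself is a placeholder, not an actual superbig call.

def retrieve_from(source_name: str, documents: list[str],
                  query: str) -> list[tuple[str, str]]:
    terms = query.lower().split()
    hits = [d for d in documents if any(t in d.lower() for t in terms)]
    return [(source_name, h) for h in hits]

description = ["A stoic knight who guards the northern gate."]
chatlog = ["You met the knight at the gate yesterday."]

# one call per source, then merge
results = (retrieve_from("description", description, "knight gate")
           + retrieve_from("chatlog", chatlog, "knight gate"))
for source, text in results:
    print(f"({source}) {text}")
```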