Technologicat closed this issue 1 month ago.
EDIT: Main post updated. Thanks Cohee.
Could a seasoned user or dev chime in on these points that I couldn't find any documentation on? This is something that could be included in the user manual.
Bonus unrelated question:
Checkpoints are a named branch. A branch is created with one click, with an auto-generated name. Both do the same thing - clone the chat file at the designated point - hence they are manageable in the chat management interface. Bookmarks are the older name for checkpoints; Timelines (I call it that, in the plural) wasn't updated to use the new name. The wand menu is used for extensions, even if they are built-in (but can still be disabled). The burger menu is for core features, mostly chat- or text-generation related.
Thanks! This helps a lot. I updated the section on tooltips in my TODO list.
To avoid confusion, it might be good to change the terminology in the Timelines extension. I suppose it needs no other updates right now than changing a few instances of "bookmark" to say "checkpoint" instead, and changing the title of the settings panel to "Timelines", in the plural. Also worth mentioning the `/tl` (open timeline view) and `/tl r` (refresh timeline graph) slash commands in the README - discovered those too by skimming the source code. Particularly, `/tl` makes for a nice Quick Reply button, to open the timeline view with a single click.
One remaining thing: I still don't understand why the details panel for a chat message in Timelines has the '→' button, since you can just open the message normally and swipe / continue chatting to achieve the same effect.
~~I'll see if I find the time to submit a PR for Timelines, too. :)~~ EDIT: Sent: https://github.com/SillyTavern/SillyTavern-Timelines/pull/15. EDIT: As of 1 February 2024, merged.
EDIT: Turned out to be a red herring.
Wait, what?
`SillyTavern/public/scripts/extensions/third-party/SillyTavern-Timelines/tl_node_data.js` has this check:

```js
if (message.is_system && message.mes.includes('Bookmark created! Click here to open the bookmark chat')) return true;
```

But in `SillyTavern/public/script.js` (e627e897), the actual message format is:

```js
mes: 'Checkpoint created! Click here to open the checkpoint chat: <a class="bookmark_link" file_name="{0}" href="javascript:void(null);">{1}</a>',
```

so the check can't trigger.
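To see concretely why the check is dead code, here is a minimal sketch (plain Node; the two strings are copied from the files above):

```javascript
// The substring Timelines looks for, vs. the message script.js actually produces.
const checkedSubstring = 'Bookmark created! Click here to open the bookmark chat';
const actualMessage =
    'Checkpoint created! Click here to open the checkpoint chat: ' +
    '<a class="bookmark_link" file_name="{0}" href="javascript:void(null);">{1}</a>';

// The includes() test can never match the current message format.
console.log(actualMessage.includes(checkedSubstring));  // → false
```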
I'll fix this while at it.
This system message has been unused for long, even before a rename. There should be no chats where this is contained.
For those following along at home: fix reverted. The terminology fixes have been merged.
Regarding https://github.com/SillyTavern/SillyTavern/issues/1671, now there's a draft for a skeleton of RAG documentation in the first post.
Comments welcome, as well as where to put that and in what format. I suppose the user manual would eventually be the place, and Markdown the format?
Should we add a mention of the new `embeddings` Extras module as an optional module under Vector Storage in the Extensions ⊳ Manage Extensions view?
(This UI is so large that I keep missing things.)
One more thing I don't understand at the moment: if I decrease the max context length in AI Response Configuration, from 8192 to 4096 [and no other changes anywhere], then Vector Storage silently breaks - no more RAG injections. In an initially empty chat, ST then also cuts off the prompt before the user message (that had the file attachment).
Then ST either hangs (nothing is generated), or the LLM produces a random reply (unrelated topic, and sometimes writing a question on the user's behalf). The latter is easy to understand: my character description instructs the LLM to roleplay an assistant, but it didn't get a user message to respond to.
Increasing max context back to 8192, sure enough, Vector Storage works again.
Nothing useful in the web browser console. Likewise in the ST server console.
Some function could be failing silently and prematurely terminating prompt generation?
Have to debug this, but there's a lot of code to look at.
I don't yet know which parts are relevant, except that `processFiles` in `SillyTavern/public/scripts/extensions/vectors/index.js` could tell me whether it actually got file chunks, and `Generate` in `SillyTavern/public/script.js` might be a good place to look in general.
I'd like to get this working also at smaller context sizes (while acknowledging the limitations), because 8192 is kind of pushing at the limits of an 8 GB GPU.
EDIT: Some clarifications.
Tracing the code...

- `Generate` calls `runGenerationInterceptors`, defined in `SillyTavern/public/scripts/extensions.js`.
- `SillyTavern/public/scripts/extensions/vectors/manifest.json` registers the interceptor `vectors_rearrangeChat`.
- `rearrangeChat` in `SillyTavern/public/scripts/extensions/vectors/index.js` calls `processFiles`.
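To keep the chain straight in my head, here is a rough model of the interceptor mechanism. The function names follow the real files, but the signatures and data flow are simplified guesses for illustration, not the actual API:

```javascript
// Simplified model of the generation-interceptor chain.
const interceptors = [];

// extensions.js (simplified): extensions register interceptors; Vector
// Storage's manifest.json registers vectors_rearrangeChat.
function registerInterceptor(fn) {
    interceptors.push(fn);
}

// vectors/index.js (simplified): rearrangeChat queries the vector DB
// (processFiles handles attached files) and splices chunks into the chat.
registerInterceptor(function vectors_rearrangeChat(chat) {
    chat.unshift('[RAG chunk retrieved via processFiles]');
});

// script.js (simplified): Generate runs all interceptors before building
// the prompt, so each one may mutate the chat array.
function runGenerationInterceptors(chat) {
    for (const fn of interceptors) fn(chat);
    return chat;
}

const chat = ['User: How would you summarize the paper?'];
runGenerationInterceptors(chat);
console.log(chat[0]);  // → '[RAG chunk retrieved via processFiles]'
```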
EDIT: Some investigation.
I think the inserted chunks just overflow your context and it couldn't proceed with the generation.
That is my hunch, too, but why doesn't it build a complete prompt, or produce any kind of error or warning message anywhere?
I'd understand if the RAG chunks were missing if they couldn't fit into the context length - though I'd like an explicit warning when this happens.
But the latest user message - a simple "How would you summarize the episodic memory theory of consciousness?" (to use Budson et al., 2022 for testing) - is missing too. It's like the latest turn in the conversation never started.
(Related to #1741, this particular PDF, with that particular question, is also prone to returning two identical copies of the abstract as the RAG chunks.)
I'll look into this a bit and see if we could at least add an explicit warning somewhere (maybe `toastr`, or at least `console.log`).
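A sketch of the kind of guard I have in mind. This is a hypothetical helper, not existing ST code; the token counts and the commented-out toastr call are stand-ins for whatever the real prompt builder uses:

```javascript
// Hypothetical guard: warn instead of failing silently when the injected
// RAG chunks cannot fit into the configured max context length.
function checkInjectionBudget(maxContext, promptTokens, chunkTokens) {
    if (promptTokens + chunkTokens > maxContext) {
        const msg = `Vector Storage: retrieved chunks (${chunkTokens} tokens) ` +
                    `do not fit into max context ${maxContext}; skipping injection.`;
        console.warn(msg);       // at minimum, log it
        // toastr.warning(msg);  // and/or surface it in the UI
        return false;
    }
    return true;
}

console.log(checkInjectionBudget(4096, 3500, 1200));  // → false (would warn)
console.log(checkInjectionBudget(8192, 3500, 1200));  // → true
```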
EDIT: This comment is outdated.
Also, looking at Timelines: `makeTapTippy` in `SillyTavern/public/scripts/extensions/third-party/SillyTavern-Timelines/index.js`, and `navigateToMessage` in `SillyTavern/public/scripts/extensions/third-party/SillyTavern-Timelines/tl_utils.js`.

EDIT: PR submitted: https://github.com/SillyTavern/SillyTavern-Timelines/pull/16. Didn't touch Lock nodes for now, but added and improved tooltips, and added a Refresh Graph GUI button (same effect as `/tl r`).
EDIT: As of 1 February 2024, https://github.com/SillyTavern/SillyTavern-Timelines/pull/16 has been merged.
@Technologicat I have an idea/question about the RAG feature: what if it were mixed with WI? It would work like this:

This gives the flexibility of WI with the additional power of RAG. The WI entry text in this case would work as part of the instruction - telling the AI what is in the storage and what to do with the information from it. (Example - WI entry text: "This is a description of the old forest:", and the attached file may contain a description of the forest and what can be found in it: flora, fauna, places, etc.) Do you think it's possible to add?
@Cohee1207: What is your opinion about the suggestion by @valden80 above?
At first glance:
As for the query, I think it would be best to use the same query as for all other RAG sources (i.e. the latest chat messages), because the dynamically changing query is where the power of RAG comes from (i.e. it always looks for things relevant to the current context).
I like it when new features are implemented with the minimal possible effort, but this seems a bit more elaborate, with a lot of moving parts going on behind the scenes.

However, you could add a new metadata field to store the source of the RAGed part and use the same chat collection both for messages and all the connected lore (global, character, chat). Attaching files to WI is just an extra step - files are converted into a text representation first either way, so you could theoretically add the ability to attach files by converting them to entry textual content first, while maintaining perfect backward compatibility with older versions and other frontends.

However, I don't know if that would actually be helpful, as we're still far away from the land of the unicorns where you could just drop a wiki link and the LLM would understand everything, but you're free to try it anyway.
Thanks for the input!
Yes, that is why I asked - sounds like this could require nontrivial changes. This is also not aligned with my current priorities, but on the other hand, sounds like it would have exciting potential for expanding the capabilities of the world info system.
It's also a logical extension of existing functionality. AFAIK, world info entries are similar to RAG, but using exact keyword search. This observation suggests two alternative ways to expand WI by RAG:
Stashing the info into the chat collection sounds like a promising approach. But I'll have to take a closer look at the chat collection. So far I have no idea how anything except the actual messages is treated. I suppose I could start by tracing what `Generate` does.
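Roughly, the metadata idea could look like this. The item shape and the `source` field are my own assumptions for illustration; I don't yet know what the real collection items look like:

```javascript
// Sketch: tag each vector-collection item with the source of its text,
// so chat messages and attached lore can share one collection.
// The item shape and 'source' values here are hypothetical.
const collection = [
    { text: 'User: Tell me about the old forest.',
      metadata: { source: 'chat' } },
    { text: 'This is a description of the old forest: ...',
      metadata: { source: 'world_info', entry: 'Old forest' } },
];

// Retrieval could then filter or label hits by source.
function bySource(items, source) {
    return items.filter((item) => item.metadata.source === source);
}

console.log(bySource(collection, 'world_info').length);  // → 1
```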
(To avoid confusing Timelines, it might be best to at least refrain from injecting "messages" that are not actually messages.)
Yeah, tools like RAG aren't perfect. Good search terms make a huge difference, whether it's a keyword or semantic search.
So... I suppose I won't be doing this now, but this could be interesting later. As usual, no promises.
This issue has gone 6 months without an update. To keep the ticket open, please indicate that it is still relevant in a comment below. Otherwise it will be closed in 7 days.
This issue was automatically closed because it has been stalled for over 6 months with no activity.
I have way too many TODO notes stored in various places, especially given that I'm a ~~drive-by contributor~~ guest developer? here, so it's better to manage them centrally in one place.
The list below is mostly what I'm personally interested in at the moment, either for work or hobby reasons. Suggestions are welcome, but I won't make promises about which ones I'll accept, if any - quite simply, I already have my plate full at least for a while.
Code:
- `insertVectorItems` in `SillyTavern/public/scripts/extensions/vectors/index.js` calls `/api/vectors/insert`, of which the essential parts are implemented by `insertVectorItems` in `SillyTavern/src/endpoints/vectors.js`. This API call is monolithic in the sense that the fetch completes only after the whole file has been processed.
- `getBatchVector` in `SillyTavern/src/endpoints/vectors.js`.
- (`links=on` mode) the RAG treatment to avoid filling up the whole prompt. #1744, #1743
- When `links=on`, and Websearch visits results pointing to scientific preprints on arxiv.org, it omits the actual abstract when retrieving any `arxiv.org/abs/...` page - which is specifically meant to show the abstract. Would be nice to figure out what's going on here. I need those abstracts in the results.

Assets:
Tooltips:

- `{{summary}}` is the relevant tag.

Documentation:
- `/th` (alias `/talkinghead`) - switch Talkinghead mode of Character Expressions on/off.
- `/getchatname` - return the name of the current chat file into the pipe, for use in scripts. The name does not include the `.jsonl` file extension. Example: `/getchatname | /echo severity=info {{pipe}}`
- `/trigger await=1` - like regular `/trigger`, but wait for generation to finish before continuing the script. Example: `/send Hello | /trigger await=1 | /echo Done`. With `await=1`, the `/echo` runs only after the AI finishes replying; thus the AI's reply message will be in the chat at that time. A larger example (using `await=1`) is provided in #1777, for summarizing the main point from a scientific abstract into one sentence, in three steps.
- `/chat-manager` (alias `/chat-history`, `/manage-chats`) - open the Manage chat files view.
- `{{model}}` - macro, replaced by the name of the connected LLM.
- `talkinghead`.
- The `talkinghead` section of the ST user manual needs to be updated.
- `/emote` support.
- `talkinghead` character in Stable Diffusion.
- The `talkinghead` extension is an independent exploration of similar ideas in the context of animating the avatar of an AI character, based on a single static image.
- `talkinghead` updates.
- The `embeddings` module in ST-extras significantly boosts Vector Storage's ingestion performance.
- `embeddings` module.
- The `--embedding-model` command-line argument of the ST-extras server.
- The `embeddings` module uses the device (CPU or GPU) that you have configured your ST-extras to use. As usual, see the `--cpu` (CPU, default), `--cuda` (GPU) and `--cuda-device` command-line arguments.
- The `chromadb` module in ST-extras, and the `chromadb` Python library, are no longer needed.

@Cohee1207: Is it possible to assign this issue to me? I'll keep it updated, and close when no longer relevant.