Technologicat's TODO list

I have way too many TODO notes stored in various places, especially given that I'm a ~drive-by contributor~ guest developer? here, so it's better to manage them centrally in one place.

The list below is mostly what I'm personally interested in at the moment, either for work or hobby reasons. Suggestions are welcome, but I won't make promises which ones I'll accept, if any - quite simply I already have my plate full at least for a while.

Code:
- Add drag'n'drop support for attaching files.
- Use case: I often have a file manager open, and it would be nice to just drop a file from there instead of manually navigating to the same directory in Attach a File, and then re-checking from the file manager what the filename was (in case I had them sorted in some order other than alphabetical).
- There is already a drag'n'drop handler somewhere, but it refuses PDF files, although you can attach them just fine.
- Vector Storage: UX: Figure out why it's failing silently instead of producing an error message when the context length is too small. (See separate comment below.)
- Vector Storage: Add a progress bar to the document ingestion process.
- The ST server internally sends the chunks in batches of 10.
- But the issue here is that insertVectorItems in SillyTavern/public/scripts/extensions/vectors/index.js calls /api/vectors/insert, of which the essential parts are implemented by insertVectorItems in SillyTavern/src/endpoints/vectors.js. This API call is monolithic in the sense that the fetch completes only after the whole file has been processed.
- So if we want a progress bar, we need to modify the API - one API endpoint to dispatch the vectorization request and return a unique ID for that request, and another API endpoint taking in that unique ID, to monitor a request's status. The status would be managed by getBatchVector in SillyTavern/src/endpoints/vectors.js.
- I really like nice UX, so maybe I should post another ticket and suggest this...
- What we can do easily right now is show a message when the vectorization completes, but before the LLM starts processing the prompt.
- Vector Storage: Give Websearch results (in links=on mode) the RAG treatment to avoid filling up the whole prompt.
- Give the RAG treatment to any chat message longer than a threshold?
- Vector Storage: Add a mention of the new embeddings Extras module as an optional module under Vector Storage in the Extensions ⊳ Manage Extensions view.
- Manage chat files: Ask Cohee if it's ok to make this search field a swoop (fragment search), too, or to add that as an option. (The Timelines search currently is a swoop.)
- Question and working prototype code posted as #1797
- Manage chat files: Would be nice if this panel closed itself when clicking outside it, like Timelines does.
- 1744
- 1743
- Talkinghead: GUI panel (at client side, in Character Expressions settings) for configuring the per-character animator and postprocessor settings.
- Vector Storage: Preprocessor hook.
- Would be very useful for things like stripping reference lists from scientific articles, since those tend to have the highest concentration of keywords of any kind (and hence have a tendency to poison top-k RAG queries).
- Or maybe just a "Strip reference lists (for scientific papers)" option for now?
- Websearch: arXiv abstracts missing.
- When links=on, and Websearch visits results pointing to scientific preprints on arxiv.org, it omits the actual abstract when retrieving any arxiv.org/abs/... page - which is specifically meant to show the abstract. Would be nice to figure out what's going on here. I need those abstracts in the results.
- Websearch: Discuss with Cohee if a feature to retrieve a given URL would be desirable.
- I'd use it to summarize or find information at a particular web page for which I already have the URL (say, a blog post, some API documentation, or an arXiv abstract page).
- This would separate the search from the retrieval and analysis.
- Websearch: I particularly like Startpage for their privacy-first focus, so I'd like to add that as an engine on the ST-extras side. But I'll have to check their TOS first, as well as the feasibility of parsing their particular result format.
- See the Talkinghead TODO. Still many things remaining there.
- But first check latest posts in https://github.com/SillyTavern/SillyTavern-Extras/issues/206 to see if it's been done already. The Talkinghead TODO only updates when backend changes are made (since it's hosted in the Extras repo).
Assets:
- Talkinghead needs some example characters. It's easy to create new characters - yes, so show, don't tell.
- An AI assistant as another default character card?
- Idea posted as #1805.
- Fix the chair leg in the cyberpunk bedroom with some inpainting?
Tooltips:
- Tooltip improvements vol. 1 has been merged. Let's go for vol. 2.
- Summarize extension needs a tooltip for the insertion template, particularly that {{summary}} is the relevant tag.
- The User settings ⊳ Swipes tooltip could be improved. Right now it doesn't say that if you delete all chat messages below a message that has swipes, the swipes are still there, and you can pick a different one. How to word that compactly?
- The Prompt message action tooltip could be "Prompt itemization report", for clarity. Bare "Prompt" sounds like a verb; prompt what?
- Author's note panel ('≡' menu ⊳ Author's Note) is missing a tooltip for the Depth setting in insertion position (i.e. how many messages from curren end of chat; could copy the tooltip from the Summarize or Vector Storage config panels.)
- The Manage chat files dialog is missing some tooltips for buttons.
- The wand menu tooltip should be just "Extensions", no longer just from Extras, but including the built-ins.
- A branch is just an auto-named checkpoint, otherwise they are the same thing. Indicate this in the tooltips.
- There's now a section on checkpoints and branches in the Timelines README update that's queued with the UX improvements. This could be adapted to the main user manual, too.
- ~Timeline extension needs tooltips for its settings.~ EDIT: Done, https://github.com/SillyTavern/SillyTavern-Timelines/pull/16 and https://github.com/SillyTavern/SillyTavern-Timelines/pull/18
Documentation:
- New slash commands and macros (1.11.3, 1.11.4):
- /th (alias /talkinghead) - switch Talkinghead mode of Character Expressions on/off
  - This is convenient as a Quick Reply button, to save GPU resources (especially on a laptop) if you know you'll be AFK for a while.
  - Talkinghead now auto-pauses when SillyTavern is hidden, so you might not need this slash command - but due to the way web browsers detect whether the page is visible, that requires either switching to another tab in the same browser window, or minimizing the whole browser window. The slash command is useful when you leave the browser window open, with the SillyTavern tab active.
- /getchatname - return the name of current chat file into the pipe, for use in scripts.
  - The name is returned without the .jsonl file extension.
  - E.g. /getchatname | /echo severity=info {{pipe}}
- /trigger await=1 - like regular /trigger, but wait for generation to finish before continuing the script.
  - Works also in streaming mode. The AI's reply appears bit by bit as usual. When the AI finishes generating, the script continues running.
  - E.g. /send Hello | /trigger await=1 | /echo Done
  - Due to the await=1, the /echo runs only after the AI finishes replying. Thus the AI's reply message will be in the chat at that time.
  - A more complex example script (with and without the new await=1) is provided in #1777, for summarizing the main point from a scientific abstract into one sentence, in three steps.
  - Useful for running scripts with multiple processing steps, where the later steps need the AI to reply to previous ones first.
- /chat-manager (alias /chat-history, /manage-chats) - open the Manage chat files view
  - E.g. you can bind this to a Quick Reply button to give one-click access to the chat manager.
- {{model}} - macro, replaced by the name of the connected LLM
  - Useful for an AI assistant, e.g. if you upgrade the LLM in your local ooba backend often.
  - The value is updated when ST connects to the backend API, typically at the beginning of the ST session.
  - In the character card, e.g. "You are an AI Large Language Model, version {{model}}."
- The new, rewritten talkinghead.
- The README still needs pictures, particularly of the effects of the different postprocessor filters.
- The talkinghead section of the ST user manual needs to be updated.
- Until then:
  - See the new Talkinghead README.
  - We still use the THA3 AI posing engine [1] [2], but pretty much everything else has been revised, rewritten, or extended.
  - Highlights:
  - New pixel-space postprocessor, for things like scifi hologram or 1980s VHS tape effects
  - /emote support
  - Talking animation while LLM is streaming text (talking animation no longer needs TTS)
  - New much more natural-looking idle animation
  - Much better performance on GPU (think ~40-60 FPS on RTX 3070 Ti laptop GPU)
  - Framerate-corrected animation, like in games (high-end hardware can make it smoother, doesn't speed it up)
  - Both server-wide and per-character configuration
  - Emotion template editor GUI app (runs on wxPython), can also generate static expression sprites
  - Instructions for making a talkinghead character in Stable Diffusion
  - Limitations:
  - No Live2D or VRM compatibility. The talkinghead extension is an independent exploration of similar ideas in the context of animating the avatar of an AI character, based on a single static image.
  - Some features do not work in Visual Novel Mode or in group chats. This extension focuses on 1-on-1 interaction with a single AI character for now.
  - For stuff still missing, see the Talkinghead TODO.
  - https://github.com/SillyTavern/SillyTavern-Extras/issues/206 acts as a temporary devblog where I post info on talkinghead updates.
- Retrieval Augmented Generation (RAG).
- The Vector Storage builtin extension needs a section in the ST user manual.
  - @Botoni has requested RAG documentation, and I agree ST needs this. While much of the userbase of ST doesn't care about RAG, it's a power user thing, so some of us do!
  - Essentially, RAG in ST in 15 minutes:
  - RAG in a nutshell:
    - What RAG is: Essentially, a smart search engine for a collection of text files (and/or chat history), with the NLP capabilities of an LLM.
    - Similar to: World Info / lorebook entries - but RAG triggers on semantic latent space matches, not on exact keywords; and also the entries are created automatically from text input, rather than curated manually.
    - Especially useful for: Fact-heavy Q&A on one or more documents that are too long to fit into the context window.
    - NOT useful for: Silver bullet for "infinite context size" for storywriting (why, see below).
  - How RAG works, in general:
    - You feed one or more documents into the RAG system, which ingests them into its database.
    - ST can also feed the chat history into the database.
    - Then you can ask the RAG system specific questions about the information contained in its database.
    - One RAG query consists of two steps, which run automatically in the following order:
    - Retrieval: The RAG system searches its database for matches relevant to your question, and retrieves relevant chunks of text.
    - Generation: These chunks are automatically injected into the LLM prompt, and then the LLM composes the answer for you.
    - Original paper that introduced the concept: Lewis et al. (2020, revised 2021): Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  - Where to find RAG in ST:
    - Settings at Extensions ⊳ Vector Storage (it is a built-in extension)
    - Can be enabled separately for files (text file attachments) and chat messages:
    - Files is for Q&A on one or more documents. This can be highly useful. TL;DR is not an issue for a computer program.
    - Chat messages is for a very limited form of long-term memory for AI characters. Whenever you explicitly mention some specific topic, your AI character will see the best search matches on that topic from all of the chat history so far, regardless of whether that information has already scrolled out of the context window. And that's it. You must explicitly mention a thing to retrieve "memories" of it.
    - You can configure how many latest chat messages to consider as the RAG query (setting: Query messages), and how many RAG search results (at most) to inject into the prompt (setting: Retrieve chunks). Note that each individual search result can be rather lengthy, depending on the Chunk size (chars) setting.
    - Keep in mind the RAG retrieval results must fit into the LLM's context window (along with everything else).
    - In ST, each chat (roughly speaking) keeps its own RAG database.
    - Strictly speaking, as of SillyTavern 1.11.3 'staging' (3cf01e76):
      - The RAG database is organized into collections.
      - The chat history of one chat file (in the sense of the Manage chat files view) is one collection.
      - As for chat branching, keep in mind that creating a branch just creates a new chat file, copying some of the content of the original.
      - One file attachment is one collection.
      - Each unique file is ingested only once, even across several chats. Since the content of the RAG collection for that file is always the same, the single copy is automatically shared.
      - Currently file identity is detected from the filename.
      - The current chat file has RAG access to:
      - Its own chat history collection, and
      - The collection of any attached file that is attached to any message in the chat history.
      - When Vector Storage is off, "attached file" means exactly that. The entire text content (but not the filename) of the attachment is essentially dumped right there into the chat message. The user sees a file attachment on the chat message in the GUI; the AI instead sees the text content of the file.
      - When Vector Storage is on, and enabled for files, then upon sending a message to the AI:
      - The chat history is scanned for messages that have a file attachment.
      - Any file whose text content is shorter than Size threshold (KB) remains inlined to the chat message (as far as the AI is concerned).
      - Any file attachment whose text content is at least Size threshold (KB) gets the RAG treatment.
        
        In the prompt sent to the AI, the inlined file content of the "large" file is omitted, and replaced by RAG-retrieved chunks that match the query.
        
        "query" = the combined text of as many most recent chat messages as set in Extensions ⊳ Vector Storage ⊳ Query messages.
        
        The same latest query applies also to attachments further up in the chat history! Keep in mind what the AI is doing: it is responding to you using all available information available to it. So the information the AI needs now - regardless of where in the chat history a file attachment is placed - depends on what you are discussing now. Thus, if you're debugging your prompts, and old RAG retrieval results seem to change as the chat continues, this is why - the query has changed.
        
        The chunks are injected into the chat message that has the file attachment. In a long chat, that message might have scrolled past the context limit - if so, the AI won't see the retrieved chunks.
      - When Vector Storage is on, and enabled for chat messages, then upon sending a message to the AI:
      - The latest Retain# messages are always kept in the prompt, no matter what.
      - All older messages in the history of the current chat are matched using the RAG system with the current query.
        
        The Insert# best-matching chunks are injected - in order of decreasing relevance - to the Injection Position specified in Vector Storage settings.
        
        The chat messages those chunks came from are omitted, to reduce duplication. (Otherwise they could end up in the prompt, if they have not yet scrolled past the context limit.)
      - When Vector Storage is on, and enabled for both files and chat messages, both of the above apply; the files are treated first.
    - To use RAG on one or more files, attach a file to your chat message:
    - wand menu ⊳ Attach a file (supported formats: at least TXT, PDF)
    - When you send the chat message that includes the attachment, the RAG system ingests the file into its database before your message is sent to the LLM.
      - The file is automatically chunked, and these chunks become the values of the database entries. Keys... see below.
      - After a file has been ingested, all messages in the same chat have access to its content.
      - If the attachment is later deleted, or the message is removed from the chat, the file is no longer seen by that chat. (As per the details above.)
    - IMPORTANT: You can ask your first question already when you attach the file.
      - The AI never sees the filename or file metadata - it sees only the file content. Also, if the file is large enough to get the RAG treatment, then at any one time, the AI only sees those chunks of the file that the RAG retrieval step determined were relevant in the context of your query.
      - So strictly speaking, it doesn't make sense to send a message like "Here's a document, please read it", because the reading happens automatically, behind the scenes. The AI has no idea that there ever was a "document" (as a thing separate from chat messages) - although being an LLM, it might notice that what you "typed" looks suspiciously like a "document", and word its reply accordingly.
      - Particularly, for scientific use as a literature combing assistant: when you feed in a scientific paper, the AI has no idea about authors, year, paper title, ...so you'll want to ask specific questions about the content of the paper. Example:
      - ❌ "What do Blah et al. (20xx) say about LLMs?"
        
        "Blah et al." probably did not end up in the database, and "LLMs" is too generic to trigger good-quality matches.
      - ✅ "What are the key contributions of the RWKV LLM architecture?"
        
        Many AI papers talk about "key contributions" (which should also be semantically close to "results" and similar topics), and for a paper talking specifically about RWKV, asking about the architecture of that should turn up matches in a semantic full-text search.
      - When in doubt, for debugging, look at the generated prompt in the ST server console window to see the raw text chunks that the retrieval step found.
  - RAG database keys are sentence embeddings, thereby enabling semantic search.
    - Sentence embedding = a relative of word embedding, but for longer pieces of text, such as sentences or entire paragraphs.
    - This embedding process is also known as vectorization; a database that uses such embeddings as keys is known as a vector database (hence extension name Vector Storage).
  - IMPORTANT: To re-emphasize the most important point: RAG is essentially a search engine. Your query must explicitly mention the thing you're looking for. Exact wording is not important, but you must mention something that has a meaning roughly similar to the thing you're looking for.
    - In other words: RAG will NOT magically keep the story so far implicitly alive; it will only search for things that you mention explicitly in your query!
    - "query" = the combined text of as many most recent chat messages as set in Extensions ⊳ Vector Storage ⊳ Query messages.
    - Finally, when wording your query, keep in mind that you're talking not only to the retrieval system, but also to the LLM.
    - If nothing in the database matches, as of early 2024 an LLM won't say this - it will confidently use its built-in knowledge, thus producing output that is prone to hallucinations.
    - Very specific things that the LLM itself doesn't know much about, but which are discussed in your documents (or in the chat history, in that mode), are good candidates for search topics.
- New embeddings module in ST-extras significantly boosts Vector Storage's ingestion performance.
  - It computes embeddings on all CPU cores in parallel, or optionally on the GPU (if you have the VRAM).
  - To enable it:
  - Make sure your ST-extras is up to date.
  - Configure your ST-extras server to load the embeddings module.
    - Optionally, you can choose a custom text embedding model just as before, using the --embedding-model command-line argument of the ST-extras server.
    - The embeddings module uses the device (CPU or GPU) that you have configured your ST-extras to use. As usual, see the --cpu (CPU, default), --cuda (GPU) and --cuda-device command-line arguments.
    - The chromadb module in ST-extras, and the chromadb Python library, are no longer needed.
  - Restart your ST-extras server.
  - Restart SillyTavern, just to be safe.
  - In the ST GUI: Extensions ⊳ Vector Storage ⊳ Vectorization Source: choose Extras
  - Done! From now on, Vector Storage will compute its embeddings using the fast Extras provider.

@Cohee1207: Is it possible to assign this issue to me? I'll keep it updated, and close when no longer relevant.

EDIT: Main post updated. Thanks Cohee.

Could a seasoned user or dev chime in on these points that I couldn't find any documentation on? This is something that could be included in the user manual.

What is the difference between checkpoints and branches in ST?
- EDIT: A checkpoint is just a named branch, they are otherwise the same.
How do checkpoints and/or branches relate to the Timeline extension?
- If you create a checkpoint, that shows up as a bookmark in Timeline, and the relevant path in the timeline graph is colored (as the Timeline README mentions for bookmarks).
- Are Timeline bookmarks and ST checkpoints the same thing? If so, the Timeline documentation and GUI should say "checkpoint" instead of "bookmark".
- EDIT: Yes, ST checkpoints used to be called bookmarks. They are the same thing.
- Is it Timeline or Timelines? After you install the extension, the settings panel under Extensions says "Timeline", but the GitHub repo name is "Timelines".
- EDIT: It's Timelines.
Is there some way to delete / rename / manage checkpoints? I can imagine them cluttering the timeline legend quite quickly...
- EDIT: Yes: checkpoints are just chat files. Use the Manage chat files view.
In Timeline, in what situation would I want to press Branch from... (the '→' button), instead of just normally opening the chat message (by clicking its title/timestamp, next to the '→' button), and then replying or swiping as normal? Unless I've missed something, the latter approach seems to create a tree structure just fine.

Bonus unrelated question:

What is the difference in intent between the '≡' and wand menus?
- The wand menu tooltip says Extras Extensions, but that is no longer accurate, because Attach File belongs to the built-in Vector Storage extension. Translate Chat also seems to be a built-in thing nowadays. Should those items be in the '≡' menu instead?
- EDIT: Nowadays, the wand menu is for extensions, whether builtin or external. It's generally for stuff that could in principle be disabled in the settings. The '≡' menu is for core features, mostly related to chat or text generation.

Checkpoints are a named branch. A branch is created using one click only with an auto-generated name. Both do the same - clone the chat file at the designated point, hence they are manageable at the chat management interface. Bookmarks are the older name for checkpoints, Timelines (I call it that in plural) weren't updated to use a name. The wand menu is used for extensions, even if they are built-in (but still could be disabled). The burger menu is for core features, mostly chat or text generation related.

Thanks! This helps a lot. I updated the section on tooltips in my TODO list.

To avoid confusion, might be good to change the terminology in the Timelines extension. I suppose it needs no other updates right now than just changing a few "bookmark" to say "checkpoint" instead. And changing the title of the settings panel to "Timelines", in plural. And mentioning the /tl (open timeline view) and /tl r (refresh timeline graph) slash commands in the README - discovered those too by skimming the source code.

Particularly, /tl makes for a nice Quick Reply button, to open the timeline view with a single click.

One remaining thing, I still don't understand why the details panel for a chat message in Timelines has the '→' button, since you can just open the message normally, and swipe / continue chatting to achieve the same effect.

~I'll see if I find the time to submit a PR for Timelines, too. :)~ EDIT: Sent, https://github.com/SillyTavern/SillyTavern-Timelines/pull/15. EDIT: As of 1 February 2024, merged.

EDIT: Turned out to be a red herring.

Wait, what?

SillyTavern/public/scripts/extensions/third-party/SillyTavern-Timelines/tl_node_data.js has this check:

        if (message.is_system && message.mes.includes('Bookmark created! Click here to open the bookmark chat')) return true;

But in SillyTavern/public/script.js (e627e897), the actual message format is:

            mes: 'Checkpoint created! Click here to open the checkpoint chat: <a class="bookmark_link" file_name="{0}" href="javascript:void(null);">{1}</a>',

so the check can't trigger.

I'll fix this while at it.

I'll fix this while at it.

This system message has been unused for long, even before a rename. There should be no chats where this is contained.

For those following along at home: fix reverted. The terminology fixes have been merged.

Regarding https://github.com/SillyTavern/SillyTavern/issues/1671, now there's a draft for a skeleton of RAG documentation in the first post.

Comments welcome, as well as where to put that and in what format. I suppose the user manual would eventually be the place, and Markdown the format?

Should we add a mention of the new embeddings Extras module as an optional module under Vector Storage in the Extensions ⊳ Manage Extensions view?

(This UI is so large that I keep missing things.)

One more thing I don't understand at the moment: if I decrease the max context length in AI Response Configuration, from 8192 to 4096 [and no other changes anywhere], then Vector Storage silently breaks - no more RAG injections. In an initially empty chat, ST then also cuts off the prompt before the user message (that had the file attachment).

Then ST either hangs (nothing is generated), or the LLM produces a random reply (unrelated topic, and sometimes writing a question on the user's behalf). The latter is easy to understand: my character description instructs the LLM to roleplay an assistant, but it didn't get a user message to respond to.

Increasing max context back to 8192, sure enough, Vector Storage works again.

Nothing useful in the web browser console. Likewise in the ST server console.

Some function could be failing silently and prematurely terminating prompt generation?

Have to debug this, but there's a lot of code to look at.

I don't yet know which parts are relevant, except processFiles in SillyTavern/public/scripts/extensions/vectors/index.js could tell me if it actually got file chunks, and Generate in SillyTavern/public/script.js might be a good place to look in general.

I'd like to get this working also at smaller context sizes (while acknowledging the limitations), because 8192 is kind of pushing at the limits of an 8 GB GPU.

EDIT: Some clarifications.

Tracing the code...

Generate calls runGenerationInterceptors, defined in SillyTavern/public/scripts/extensions.js
SillyTavern/public/scripts/extensions/vectors/manifest.json registers the interceptor vectors_rearrangeChat
Implementation is rearrangeChat in SillyTavern/public/scripts/extensions/vectors/index.js
If file mode is enabled, it calls processFiles

EDIT: Some investigation.

I think the inserted chunks just overflow your context and it couldn't proceed with the generation.

That is my hunch, too, but why doesn't it build a complete prompt, or produce any kind of error or warning message anywhere?

I'd understand if the RAG chunks were missing if they couldn't fit into the context length - though I'd like an explicit warning when this happens.

But the latest user message - a simple "How would you summarize the episodic memory theory of consciousness?" (to use Budson et al., 2022 for testing) - is missing too. It's like the latest turn in the conversation never started.

(Related to #1741, this particular PDF, with that particular question, is also prone to returning two identical copies of the abstract as the RAG chunks.)

I'll look into this a bit and see if we could at least add an explicit warning somewhere (maybe toastr or at least console.log).

EDIT: This comment is outdated.

Also, looking at Timelines:

Lock nodes seems useless.
- It's a Cytoscape thing that prevents chosen nodes from moving during an animated layout, intended mainly for use with force-directed layout algorithms. EDIT: It also prevents nodes from being moved manually by dragging.
- If the setting is enabled, all nodes are locked when a render starts.
- However, regardless of the setting, all nodes are unlocked when you expand a node with swipes (click and hold), and once the expand animation completes, all nodes are locked again.
- I couldn't find evidence of any other layout animations taking place.
Sketching missing tooltips here:
- Fixed tooltip: "When you click on a node, always show the node content at the bottom left. When off, show it near the node clicked."
- The non-fixed mode could use some TLC. I find that it often covers adjacent nodes, getting in the way of navigating the graph. But in the fixed mode, the data is visually rather far from the part being navigated, necessitating glancing back and forth. Optimal would be near the node, but not covering any important stuff. Definition of "important stuff" may be tricky, which is why this is what it is...
- Avatar as root: "Show the AI character's avatar as the root node. When off, show the root node as blank."
- Show legend: "Show a legend for colors corresponding to different characters and chat checkpoints."
- Scale nodes with swipes: "When enabled, nodes that have swipes will appear subtly larger."
- Use UI Theme: "Use the colors of the ST GUI theme, instead of the colors configured specifically for this extension."
The difference between the "main button" (chat file title) and the "branch button" ('→') is that... well, pressing the "branch button" '→' creates a new branch.
- The difference in behavior only clearly manifests at a non-leaf node (i.e. not the latest message in a chat):
- The main button opens the original chat file, including any later messages. It attempts to focus the chat view on the chosen message.
- The branch button splits off the chat at the chosen node, allowing you to take the chat into a new direction from that point on. It copies any history up to and including the chosen node, but any later nodes in the original chat are not copied into the new branch (they remain in the original chat).
- See makeTapTippy in SillyTavern/public/scripts/extensions/third-party/SillyTavern-Timelines/index.js, and navigateToMessage in SillyTavern/public/scripts/extensions/third-party/SillyTavern-Timelines/tl_utils.js.
- Therefore, tooltips:
- Main button (chat file title): "Find this message in the original chat, and open it."
- Branch button ('→'): "Create a new branch starting at this message, and open it."

EDIT: PR submitted: https://github.com/SillyTavern/SillyTavern-Timelines/pull/16. Didn't touch Lock nodes for now, but added and improved tooltips, and added a Refresh Graph GUI button (same effect as /tl r). EDIT: As of 1 February 2024, https://github.com/SillyTavern/SillyTavern-Timelines/pull/16 has been merged.

@Technologicat I have idea/question about RAG feature: what if to mix it with WI? To work like this:

File can be attached to WI entry, and ingested to vector storage.
When this WI triggered - it will be processed as normal, but after the normal WI entry text it will also attach request from vector storage based on context of attached file.

This give flexibility of WI with additional power from RAG. WI entry text in this case will work as the part of instruction - to tell AI what is in the storage, and what to do with information from it. (Example - WI entry text: "This is description of the old forest:", and attached file may contain description of the forest and what can be found in it, flora, fauna, places, etc..) Do you think it's possible to add?

@Cohee1207: What is your opinion about the suggestion by @valden80 above?

At first glance:

This would essentially enable semantic search in lorebooks.
I think we would need some core changes in how WI is handled, to allow for file attachments.
Then we could include some logic in the frontend part of Vector Storage to give the RAG treatment to the WI, too.

As for the query, I think it would be best to use the same query as for all other RAG sources (i.e. the latest chat messages), because the dynamically changing query is where the power of RAG comes from (i.e. it always looks for things relevant to the current context).

I like it when new features are implemented with minimal possible effort, but this seems a bit more elaborate, with a lot of moving parts going on behind the scenes. However, you could add a new metadata field to store the source of the RAGed part and use the same chat collection both for messages and all the connected lore (global, character, chat). Attaching files to WI is just an extra step - files are either way converted into text representation first, so you could theoretically add the ability to attach files by converting them to entry textual content first while maintaining perfect backward compatibility with older versions and other frontends. However, I don't know if that would be actually helpful as we're still far away from the land of the unicorns where you could just drop a wiki link and LLM would understand everything, but you're free to try it anyway.

Thanks for the input!

Yes, that is why I asked - sounds like this could require nontrivial changes. This is also not aligned with my current priorities, but on the other hand, sounds like it would have exciting potential for expanding the capabilities of the world info system.

It's also a logical extension of existing functionality. AFAIK, world info entries are similar to RAG, but using exact keyword search. This observation suggests two alternative ways to expand WI by RAG:

Semantic search: vectorize the WI entries into the RAG index
Auto-lorebook: upload a single file and let RAG ingest it

Stashing the info into the chat collection sounds a promising approach. But I'll have to take a closer look at the chat collection. So far I have no idea how anything except the actual messages are treated. I suppose I could start at tracing what Generate does.

(To avoid confusing Timelines, it might be best to at least refrain from injecting "messages" that are not actually messages.)

Yeah, tools like RAG aren't perfect. Good search terms make a huge difference, whether it's a keyword or semantic search.

So... I suppose I won't be doing this now, but this could be interesting later. As usual, no promises.

This issue has gone 6 months without an update. To keep the ticket open, please indicate that it is still relevant in a comment below. Otherwise it will be closed in 7 days.

This issue was automatically closed because it has been stalled for over 6 months with no activity.

SillyTavern / SillyTavern

Technologicat's TODO list #1739

1744

1743