SillyTavern / SillyTavern-Extras

Extensions API for SillyTavern.
GNU Affero General Public License v3.0
549 stars 124 forks

sorts messages in ChromaDB export by 'date' order #60

Closed johnbenac closed 1 year ago

johnbenac commented 1 year ago

Added a line that sorts the ChromaDB export by date, so the messages appear in the order in which the conversation happened.
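For illustration only, here is a minimal sketch of the kind of sort described above. It is not the actual diff from this PR. It assumes the export is a dict with a `content` list whose entries carry a comparable `metadata.date` value, which are the field names used by the script quoted further down. The helper name `sort_export_by_date`, the `id` key, and the sample data are all made up.

```python
from typing import Any


def sort_export_by_date(export: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the export with its 'content' list ordered by date.

    Hypothetical helper: assumes each entry has a comparable value at
    entry['metadata']['date'], as in the script quoted later in this PR.
    """
    ordered = sorted(export["content"], key=lambda entry: entry["metadata"]["date"])
    # Build a new dict so the caller's export is left untouched.
    return {**export, "content": ordered}


# Made-up sample data: ids arrive out of order, dates give the real sequence.
sample = {
    "content": [
        {"id": "msg-0", "metadata": {"date": 1000}},
        {"id": "msg-2", "metadata": {"date": 3000}},
        {"id": "msg-1", "metadata": {"date": 2000}},
    ]
}
print([entry["id"] for entry in sort_export_by_date(sample)["content"]])
# prints ['msg-0', 'msg-1', 'msg-2']
```

Python's `sorted` is stable, so entries that share the same date keep their original relative order.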

This is my first pull request, on any project, ever. Apologies for any mistakes.

From the Discord:



When it exports the ChromaDB from the SmartContext in extras, it would be nice if the exported file had the messages in the order in which they were made. Right now, I'm getting this order for a chromadb that I exported:

msg-0, msg-21, msg-2, msg-3, msg-4, msg-5, msg-6, msg-1, msg-8, msg-9, msg-10, msg-11, msg-12, msg-13, msg-14, msg-15, msg-16, msg-17, msg-18, msg-19, msg-20, msg-26, msg-22, msg-23, msg-24, msg-25, msg-7

I don't know if that's determined by the ChromaDB codebase, or if it could just be a small tweak to the SillyTavern codebase.

I want to experiment with transforming ChromaDB databases (with AI) and sideloading them into conversations with different characters to create a "shared universe" across group chats. Having a slightly more interpretable ChromaDB export file would help. In the meantime, I may (have the AI) write a little script to reorder them chronologically, but if you all could make a small tweak to make such a script unnecessary, that would be nice.
JoJorge — Yesterday at 11:34 PM
Here is the script. It works, and the output chromaDB looks a little more like a transcript. 

Even if the wonky order is part of the ChromaDB code, you might use something like this to process the exported file before sending it to the user.

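import json
import sys


def sort_chat_data(file_path):
    """Write a copy of a ChromaDB export with its messages sorted by date."""
    with open(file_path, 'r', encoding='utf-8') as file:
        data = json.load(file)

    # Each entry in 'content' carries its date under metadata['date'].
    sorted_content = sorted(data['content'], key=lambda x: x['metadata']['date'])

    # Shallow copy so every other top-level key is kept as-is.
    new_data = data.copy()
    new_data['content'] = sorted_content

    # Swap only the trailing extension, so the input file is never overwritten
    # even when the path has no '.json' suffix or contains '.json' elsewhere.
    base = file_path[:-len('.json')] if file_path.endswith('.json') else file_path
    new_file_path = base + '-sorted.json'
    with open(new_file_path, 'w', encoding='utf-8') as sorted_file:
        json.dump(new_data, sorted_file, indent=2)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        raise ValueError("Please provide the JSON file path as the only argument.")
    sort_chat_data(sys.argv[1])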
[sort_chromadb_json.py](https://cdn.discordapp.com/attachments/1101440561442992229/1121282160196866048/sort_chromadb_json.py)
RossAscends (SillyTavern UI Dev) — Yesterday at 11:50 PM
drop into PR
End of Discord chat.