mitmedialab / chat-tutor-embeddable

https://mitmedialab.github.io/chat-tutor-embeddable/
GNU General Public License v3.0
1 stars 0 forks source link

All messages should be written to local storage #4

Open pmalacho-mit opened 9 months ago

pmalacho-mit commented 9 months ago

A user's conversation should be written to local storage, so it can easily be read back in when the user revisits. The entries should be keyed by a username, so for now we can just use 'default' (but soon we should have username's associated with sessions).

Should a user have the option of whether or not to load their previous messages in or not? Or should they be loaded by default?

shengh318 commented 8 months ago

Questions

  1. Not really sure why the entries needed to be keyed by a user name when everything is stored locally? Is it perhaps for future expansions into the Dynamo DB Tables?

  2. Are we using a sql table for this?

Comment

IMO they should be loaded by default because from my past experiences with ChatGPT, it doesn't really mattered to me whether the conversation is from the past or a new conversation.

pmalacho-mit commented 8 months ago

Great q's!

  1. The thought around keying by username is so if two people use the same computer (without closing Chrome in between), they won't see each other's messages easily -- they of course are still available though, as anyone can go look in local storage. It's maybe a little overkill, so if it adds a lot of complexity to the design, let's not do it.
  2. No, local storage is a little funny, as it can only store strings, so each element will likely be an array of messages (ideally sorted by timestamp). Interacting with it might look like the following, but we probably want to implement this in a separate file, as we might want to add some more management onto it to maybe do things like only storing that last 10 or so messages, things like that.
    const username = "default";
    const storageKey = `chat_tutor_messages_${username}`
    const serializedMessages = localstorage.get(storageKey);
    if (!serializedMessages) return;
    const messages = JSON.Parse(serializedMessages).sort();

    Comment: Okay sounds good! Let's load by default

shengh318 commented 8 months ago

Maybe I am misunderstanding local storage. Are we storing their past chat strings in a table where that table will also be stored on their local computer? I was asking if we are using like a sql table for that or other file types?

pmalacho-mit commented 8 months ago

Hey @shengh318, ah gotcha. Ya, I think I should've linked this in the original task, but localstorage is actually a browser concept:

see: https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage

It behaves like a Map<string, string>, which informs how you have to interact with it (serializing data before you store it, and deserializing / parsing when you access it)

shengh318 commented 8 months ago

Ohhh I see. Thanks! And that just gets stored in the computer's browser like Chrome?

pmalacho-mit commented 8 months ago

Yep, exactly! So localstorage is a part of the browser / javascript spec (so all major browsers 'agree' to implement that functionality in this specific way)

shengh318 commented 8 months ago

Ok I see. So like the localstorage table would look something like this?

Username Chat
user1/00:01:01 "What is Python"
tutor_to_user1/00:01:02 "Python is ..."
user2/00.02:03 "Hello there"

The thing is if we want to load the previous conversations we should also save the tutor's response to those conversations. Also the cap for storage is around 5 - 10 MB per domain name so what do you think about removing the earliest conversation for that user if we are going to exceed the storage cap?

pmalacho-mit commented 8 months ago

Yes to all those things, but just with a few tweaks:

For the storage format, I think it should look more like an array of messages (and you're definitely right, it should include both user and assistant messages). so instead of keying by <user>/<timestamp> we just key by user, and then the entry is a string representation of an array of message objects (of the Message form -- lots of wasted space, but that's okay for now).

And we can just cap it to only hold onto the previous 20 or so messages (and can explicitly check the size of serialized strings: https://dev.to/rajnishkatharotiya/get-byte-size-of-the-string-in-javascript-20jm#:~:text=Javascript%20Useful%20Snippets%20%E2%80%94%20byteSize()&text=byteSize()%20snippet%20will%20take,size%20of%20a%20given%20string.)

So there should probably be a helper class, maybe called MessageStorage that accepts an array of messages and a username which controls serializing the last n (let's call it 20) entries and writing them to the appropriate place in local storage. This helper class also should have a function to request the previous messages for a user (which would then load the string entry from local storage, and do a JSON.parse to turn it back into an array of objects).

It's important to note that this is just a convenience feature. The server is what's really responsible for storing message history, so there might be a future where we actually explicitly request message history from the server, and don't rely much on locally stored messages.

With that^ in mind, I think the other tickets are higher priority, so maybe want to put this on the back burner for now?

shengh318 commented 8 months ago

Ahh ok sounds good!