kuvaus / LlamaGPTJ-chat

Simple chat program for LLaMa, GPT-J, and MPT models.
MIT License
216 stars 50 forks

Functionality to have "memory" from loading a chat log file #7

Closed MBCX closed 1 year ago

MBCX commented 1 year ago

Right now, every time you load the program and the model, it's like starting a brand-new chat. Since it's already possible to save the chat to a log file, it would be quite awesome to be able to load that log and continue the conversation from there, kind of like how the ChatGPT API works (you can pass a string of messages to the "history" parameter and the assistant would "remember" the previous conversation).

kuvaus commented 1 year ago

Yeah, that's a good idea!

I think I could try to do the simple version first, where the chat log from a file can be loaded in the prompt when you start. That way it would read what was said before. And then you can continue the conversation. I think it would work if the previous chat log was not too long. Say, 3000-4000 characters or so. Maybe something like --load_log /path/to/logfile.txt.
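To make the "simple version" concrete, here is a minimal Python sketch of the idea: read the log file as plain text, keep only the most recent part if it is too long, and put it in front of the new message. This is not the program's actual code. The function names (`trim_log`, `load_chat_log`, `build_prompt`) and the 4000-character default are assumptions made for illustration, based only on the rough limit mentioned above.

```python
from pathlib import Path


def trim_log(text, max_chars=4000):
    """Keep roughly the last max_chars characters of a chat log (hypothetical helper)."""
    if max_chars <= 0:
        return ""
    if len(text) <= max_chars:
        return text
    tail = text[-max_chars:]
    # Drop everything up to the first newline, since the cut most likely
    # landed in the middle of a line.
    newline = tail.find("\n")
    return tail[newline + 1:] if newline != -1 else tail


def load_chat_log(path, max_chars=4000):
    """Read a saved chat log as plain text and trim it to fit the prompt."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return trim_log(text, max_chars)


def build_prompt(log_text, user_message):
    """Put the earlier conversation in front of the new user message."""
    if not log_text:
        return user_message
    return log_text.rstrip("\n") + "\n" + user_message
```

Trimming from the end keeps the most recent turns, which are usually the ones that matter when continuing a conversation. The model still only "remembers" whatever fits in the prompt.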

There's also the possibility of saving the whole state of the neural network to disk. That would produce a file of about 2 GB, but loading it would resume exactly where you left off. I'm not sure how difficult that is to code, so I haven't tried it at all yet. Maybe sometime in the future... but no promises, in case it turns out to be too difficult.

But the simple version is something I think I can do so I'll try to put it in the next version.

Thanks for the suggestion!

kuvaus commented 1 year ago

With version v0.2.1 you can now load the previous chatlog. Also fixed a bug where the chatlog was not properly saved if you had specified a prompt from the command line. Whoops.

The loading is done the simple way (it's just loaded as a text file and put into a string), so I hope it works. :)

MBCX commented 1 year ago

Whoa, that was quick! I'm going to try it out and let you know if it works! Cheers for adding the feature!

kuvaus commented 1 year ago

I'm assuming this works now. Reopen the issue or let me know if I messed up. :)

MBCX commented 1 year ago

Sorry for not commenting sooner! Yes, it does work. I am building a reverse-engineered ChatGPT interface (like ChatbotUI) that uses your program as the back end, which is why I requested the functionality 😊. I am almost done and will make it open source once I've made sure everything is stable.