'hands-free' mode - Githubissues

hennas-waifson commented 5 months ago

Hello! I have been using the assistant for almost a month now, and it has been great! What sets Elephie apart is its ability to comprehend and a memory that actually works well.

For an even deeper integration into one's workflow, it would be nice to have 'a hands-free mode' that would allow one to use the assistant without interrupting the current task = being able to stay on whatever window they are working in and communicate with the assistant.

For example, while working in a word document, pressing a key combo and asking the assistant a question via a voice. User while typing the paper: "what are the major schools of thoughts in economics?" Assistant: "There are a few major schools of thought in economics. Classical Economics - This school..." User has an option to continue same convo by repeating the key combo and speaking.

Asking the assistant about the contents of the clip-board. For example, the user ctrl-copies an email then asks the assistant as above with a clipboard-function-specific key combo 'Draft a response to this email'. Assistant responds with a draft. This way assistant can be asked about anything that can be copied into the clip-board, such as a news article, a web page, a text to proofread and keep it as part of a convo with the user.

Saving the conversation from any window by pressing a key combo.

Maybe saving the contents of the clipboard into notes, using a key combo, without having to go to the assistant's window.

The above will effectively make the assistant universally accessible throughout the operating system. Regardless of what window the user is working in, they will be able to take advantage of the assistant's functionality.

hennas-waifson commented 2 months ago

I added a number of tools to the bot -

web search
web scrape
combo of the above
access to clipboard
image view
online image view
terminal
notepad (a record of thoughts)
tts
stt
changed the prompt
edits: pdf scrape
open and view a specific conversation in full, after the usual memory search

running on a local model, the app does an excellent job using tool combos. Attached is a log where I ask the assistant for a spooky story; the AI looks up online what makes a good spooky story to write a good one for me; uses terminal to produce a spooky sound; tells me the story and uses webcam to check my reaction. I don't have coding experience, so the code is most likely a mess, but after countless hours of trial and error, it works. We can talk about this more, I wish there was a way to email you.

log_20240811_173242.txt

hennas-waifson commented 2 months ago

I implemented the tool, where three models (two debaters and one judge) hash it out to arrive at answers to 'more complicated' problems. Here is their debate on which is heavier, one pound of steel or one kg of feathers. Surprisingly, they arrive at the right conclusion.

log_20240919_204518 copy.txt

v2rockets commented 3 weeks ago

thanks, I'll check carefull when having time

v2rockets commented 1 week ago

I added a number of tools to the bot -

web search

web scrape

combo of the above

access to clipboard

image view

online image view

terminal

notepad (a record of thoughts)

tts

stt

changed the prompt

edits: pdf scrape

open and view a specific conversation in full, after the usual memory search

running on a local model, the app does an excellent job using tool combos. Attached is a log where I ask the assistant for a spooky story; the AI looks up online what makes a good spooky story to write a good one for me; uses terminal to produce a spooky sound; tells me the story and uses webcam to check my reaction. I don't have coding experience, so the code is most likely a mess, but after countless hours of trial and error, it works. We can talk about this more, I wish there was a way to email you.

log_20240811_173242.txt

Hi. Sorry for the late reply. I can see that your custom configuration work well for you case. As for your set of tools, the sheer of variety looks impressive... At current stage, I would be interested in web search/web scraper and tts/stt. Would you like to share some general idea how did you make it? What kind of tools did you use or simply code by yourself (or GPT)?

hennas-waifson commented 1 week ago

No worries. I did a lot of experimenting and have literally hundreds of versions of Loyal Elephie folder with different features. I got a little bit obsessed with it. I don't know (yet) how to use GitHub, but I will clean up one of those from my personal info and publish it as a branch.

My most recent experiment included creating what I call a "flexi prompt" - a prompt that adjusts to the current conversation. I stumbled upon this by accident and wasn't sure about it to begin with, but it works surprisingly well. It asks the model to 'adjust' the prompt based on user input and the convo before each generation. The only section that doesn't change is tools. It results in more varied responses that at the same time are more focused. And it makes total sense, as the top of the prompt is very important. It is hard to go to 'regular' prompting after that.

And another thing is a system that is checking for proper tool use; basically, every generation is checked with a small model and edited if needed to 'fix' any tag issues. It ads just a second or two of 'waiting' but seems to extend workable max context.

v2rockets commented 1 week ago

Great. Just for update, I'm think about how to implement something similar to your "ongoing concerns" in an elegant way, and I'm going to test some new embedding techniques to see if the recall of relevant notes can be made better.

hennas-waifson commented 1 week ago

Thank you! I think I probably should share with you what I have. I implemented a persistent 'notebook' with giving the model the tool to edit it using 'search' and 'replace' blocks. It worked, you may want to see it. I left some more complex features like this behind, so to say, so that I can use AI to code and try new things - and it helps to have a very small file to work with.

v2rockets / Loyal-Elephie

'hands-free' mode #14