Closed Mark-Phillipson closed 1 month ago
I think tts could be another destination - wdyt, @C-Loftus? You could say "model prompt spoken" and have it reply with tts.
Thanks for your comment
With regards to the point on conversation, yes, that is something I would like to support. Currently all requests are stateless. If we want to allow state we need:
A clear understanding of what state should be passed in, given that we may have previously manipulated an editable text field. This is non-trivial and not something that a normal chatbot needs to worry about. Is the source just the selected text? Or is it what the model returned previously? Should selected text be removed if it was pasted from the model previously? These are all things we need to consider.
@Mark-Phillipson with regards to TTS, have you used any of the commands in the TTS folder? (i.e. do you have https://github.com/C-Loftus/sight-free-talon installed for TTS?) I am curious on your user experience and if you find it useful. I also think that tts is very useful and makes it so you don't need to clutter the screen or paste anything. However, installing tts is a pain at the moment since Talon doesn't have a package manager.
... insertionDestination. So the grammar would end up being something like "model fix grammar echoed" or "model fix grammar spoken". So we keep the point-free / pipeline pattern.
Yes, I had to install the tts manually, which was not straightforward.
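To illustrate the "model fix grammar spoken" style of command, here is a minimal sketch of routing a model result to a named destination. It is not the repository's actual implementation: `fake_model`, `speak`, `echo`, and the `DESTINATIONS` table are hypothetical stand-ins, and what "echoed" really does is an assumption.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a real setup would call a TTS engine or show output on screen.
spoken_log: List[str] = []
echoed_log: List[str] = []


def speak(text: str) -> None:
    """Pretend TTS destination: records the text instead of speaking it."""
    spoken_log.append(text)


def echo(text: str) -> None:
    """Pretend 'echoed' destination: records the text instead of displaying it."""
    echoed_log.append(text)


# The last word of the spoken command selects the destination.
DESTINATIONS: Dict[str, Callable[[str], None]] = {"spoken": speak, "echoed": echo}


def fake_model(prompt: str, source: str) -> str:
    """Placeholder for the real model call (assumption, not a real API)."""
    return f"[{prompt}] {source}"


def run_command(prompt: str, source: str, destination: str) -> str:
    """Pipeline: source text -> model prompt -> chosen destination."""
    result = fake_model(prompt, source)
    DESTINATIONS[destination](result)
    return result
```

Under these assumptions, `run_command("fix grammar", "some text", "spoken")` would send the result only to the speech destination, which keeps the source / prompt / destination pipeline shape intact.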
Sometimes whilst working I'm trying to remember the name of for example a CSS property and just need a quick reminder and don't want to lose my place or focus. So I thought it would be cool to have a text to speech feature that I can trigger with a voice command.
Another example: when I'm reading an article on a website, I can say "computer define" followed by whatever word I need a definition for, without losing my place.
"Yes I had to install the tts manually which was not straightforward."
When you say manually do you mean you copied and pasted just the text to speech code from sight-free-talon and didn't clone that repository, or rather that you did clone sight-free-talon but didn't find it intuitive to use?
Do you have any suggestions on how I could make it more intuitive?
"Sometimes whilst working I'm trying to remember the name of for example a CSS property and just need a quick reminder and don't want to lose my place or focus. So I thought it would be cool to have a text to speech feature that I can trigger with a voice command."
Yup I agree!
I did clone the sight-free-talon repository, but as I only did it once I can't really remember much about it. Suffice to say it is working, and I remember having to change the speed of the voice to be able to understand it.
Should be implemented now. You can do the following to have a conversation with the model, which can optionally be verbal via TTS if you would like:
"model start thread" -> which will store your conversation in a new thread
"model toggle window" -> to open the window which shows thread visualizations
Any request (e.g. "model please tell me X") will be auto-added to the window after it returns its result
If you use "to speech" as your model destination, it will speak the output and, if you have the thread enabled, it will continue to update the window.
If you feel this is missing behavior, please file a new issue so we can iterate
Would be nice to be able to have a conversation with the model with the responses back in voice.
For example "what is the capital of France?" and the model would reply with "Paris".
Talon file script example:
This is a simple example of how to use the model to reply to a question in voice. The model will reply with a summary of the text.
It would also be beneficial to have a feature that allows us to add each interaction to a list for the model to record. Additionally, we would require the functionality to clear this list when necessary.
For example "what is the capital of France?" and the model would reply with "Paris". Then the user asks "what about Germany?" and the model would reply with "Berlin".
This is a simple example of how to use the model to reply to a list of questions in voice. The model will reply with a summary of the list. Note the add to list functionality would have to be implemented.
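A rough sketch of the "add each interaction to a list" idea, in plain Python so it can run outside Talon. Everything here is hypothetical: `Thread`, `ask`, and `toy_model` are illustrative names, and `toy_model` is a hard-coded stand-in for a real model that receives the whole message list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, str]


@dataclass
class Thread:
    """Hypothetical conversation history: one entry per user or model turn."""
    messages: List[Message] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def clear(self) -> None:
        """Forget the conversation so the next request starts fresh."""
        self.messages.clear()


def ask(thread: Thread, question: str, model: Callable[[List[Message]], str]) -> str:
    """Record the question, pass the full history to the model, record the answer."""
    thread.add("user", question)
    answer = model(thread.messages)
    thread.add("assistant", answer)
    return answer


def toy_model(messages: List[Message]) -> str:
    """Stand-in for a real model: answers follow-ups only if 'capital' was asked earlier."""
    capitals = {"france": "Paris", "germany": "Berlin"}
    last = messages[-1]["content"].lower()
    asked_capital = any(
        "capital" in m["content"].lower() for m in messages if m["role"] == "user"
    )
    for country, city in capitals.items():
        if country in last and asked_capital:
            return city
    return "I don't know"
```

With this sketch, asking "What is the capital of France?" and then "What about Germany?" on the same thread lets the stand-in model resolve the follow-up from history, while calling `clear()` first leaves the follow-up without context.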
Also, it's a bit of a stretch to rely on text to speech here, as not everybody would have it installed, so I'm not sure how to get around that.
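One possible way around the missing-TTS concern is to treat speech as optional and fall back to another destination. The sketch below assumes the caller can pass `None` when no TTS is installed; `make_destination`, `speak`, and `fallback` are hypothetical names, not part of Talon or sight-free-talon.

```python
from typing import Callable, Optional


def make_destination(
    speak: Optional[Callable[[str], None]],
    fallback: Callable[[str], None],
) -> Callable[[str], None]:
    """Return a deliver function that prefers speech and falls back otherwise."""

    def deliver(text: str) -> None:
        if speak is None:
            # No TTS available (assumption: caller passes None in that case).
            fallback(text)
            return
        try:
            speak(text)
        except Exception:
            # TTS is present but failed, so still show the user the result.
            fallback(text)

    return deliver
```

The fallback could be pasting the text or showing it in a window; the point of the design is that the voice command stays the same whether or not TTS is installed.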