Open cmroanirgo opened 6 months ago
this would improve the app by a ton! please consider adding this! it's quite annoying having to wait for the processing and not being able to continue speaking - as OP said, processing should be moved to the background and you should be allowed to continue speaking while the previous chunk is being processed.
thanks a ton for this awesome app!
In another thread's comment I mentioned wanting to dictate meetings for 1-2 hours. It would be nice if it could chop up the audio into chunks and process them while still recording, so that I could do this.
100% would LOVE to be able to use FUTO voice offline for 2hr Zoom calls.
Given I can't upload an mp3 and transcribe it after the fact, being able to do it live and in the background would be perfect.
Imagine transcribing live as a keyboard into a Google Doc and being able to edit live.
I'm currently using the medium English language model to enter words, and in general it works very well. Sometimes however I wish to add a longer brain dump: longer than a few sentences, & ultimately much longer than the 30 second limit.
I grok why 30 secs are needed.
Currently the input waits for silence, processes it and then closes the input. Is there a way for the processing to be moved to the background while awaiting the next chunk of voice? That is, it'd be awesome to read out a sentence and, at a pause, have the model begin processing while the input stays open for me to keep saying the next sentence.
I know that I can always close the voice input by clicking on the X button, so terminating it is not a problem.
Currently, when the option "Automatically stop on silence" is OFF, it waits for a manual close and then all input is processed at once. This is a suggestion to change this option to a rolling/continuous system to allow long-form input. That is, on detecting a pause, stay open but process the audio captured so far.
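To make the rolling idea a bit more concrete, here is a rough sketch in Python of what I mean. It is not how the app works internally (I haven't looked at the code); it only illustrates the producer/consumer shape: the recorder keeps running and hands each finished chunk to a background worker whenever it sees a pause. Every name here (`SILENCE_THRESHOLD`, `PAUSE_FRAMES`, `fake_transcribe`, `record`, `run`) is made up for illustration, and the "audio" is just a list of per-frame energy values standing in for real microphone input.

```python
import queue
import threading

SILENCE_THRESHOLD = 0.01  # hypothetical: energy below this counts as a quiet frame
PAUSE_FRAMES = 3          # hypothetical: this many quiet frames in a row is a pause


def fake_transcribe(chunk):
    # Stand-in for the real speech model; it only reports the chunk size.
    return f"<{len(chunk)} frames>"


def worker(chunks, results):
    # Background thread: transcribe chunks as they arrive, in order.
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: recording has been closed
            break
        results.append(fake_transcribe(chunk))


def record(frames, chunks):
    # Recording loop: never blocks on transcription, only enqueues work.
    current, quiet = [], 0
    for energy in frames:
        if energy < SILENCE_THRESHOLD:
            quiet += 1  # quiet frames are counted but not stored in this sketch
            if quiet == PAUSE_FRAMES and current:
                chunks.put(current)  # pause detected: hand off, keep listening
                current = []
        else:
            quiet = 0
            current.append(energy)
    if current:
        chunks.put(current)  # flush whatever was spoken before the manual close
    chunks.put(None)


def run(frames):
    chunks = queue.Queue()
    results = []
    t = threading.Thread(target=worker, args=(chunks, results))
    t.start()
    record(frames, chunks)
    t.join()
    return results
```

Calling `run` with a list of frame energies returns one transcription per spoken segment. A real version would also need to cap each chunk at the 30 second limit and join the text of consecutive chunks, which this sketch leaves out.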