Stypox / dicio-android

Dicio assistant app for Android
GNU General Public License v3.0
789 stars, 72 forks

[Feature Request]: Take dictation #33

Closed uhyf3 closed 1 year ago

uhyf3 commented 2 years ago

It would be good if Dicio could take dictation, composing and modifying text.

hobbycommandline commented 2 years ago

If you add this feature, consider the New Note intent https://developer.android.com/guide/components/intents-common#NewNote

Edit: I tried dispatching this action myself. It turns out that most notes apps ignore these intents, including Google Keep, Markor, and Simple Notes (from F-Droid). I will file an issue with Simple Notes; I presume Google Keep wouldn't care about such things.

https://github.com/SimpleMobileTools/Simple-Notes/issues/492

hobbycommandline commented 2 years ago

On further research, you can send plain text files to apps via the "open in" mechanism, and many more apps support that. So even though it's not the 'proper' way to do things, that would probably be the recommended way.

uhyf3 commented 2 years ago

If Simple Notes and Markor won't work, would Jota Text Editor work anyway? It has more features than Simple Notes, is simpler than Markor, and is easier to use and appears to be more stable than either. Google Keep defeats the purpose of being 100% offline anyway. Jota Text Editor is quite lightweight, so incorporating it into dicio-android should not increase the footprint much. If the "storage" permission is a problem, it could be taken out, and its "Copy," "Cut," and "Paste" features could carry the data and, along with the insertion point and selection, allow composition.

hobbycommandline commented 2 years ago

I have figured out how to do it in a way more apps recognize. You have to use an Intent with ACTION_SEND, with intent.setType("text/plain") and EXTRA_TEXT set to the message you want to send. This allows you not only to send it to apps like Google Keep, Simple Notes, and Markor, but also to send it as a Tweet or a Discord message. Unfortunately, it does require you to use startActivity, which means the user will be forced to touch their phone and cannot complete the action solely through voice.

https://developer.android.com/reference/android/content/Intent#EXTRA_TEXT https://developer.android.com/reference/android/content/Intent#ACTION_SEND
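
To check which installed apps respond to the share intent described above before wiring it into an app, the same intent can be expressed as arguments to Android's `am start` shell command and sent over adb. The following is only a sketch under stated assumptions: the class and method names are hypothetical, it merely builds the argument list, and running it requires adb and a connected device. The two constants are the string values behind `Intent.ACTION_SEND` and `Intent.EXTRA_TEXT`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds an `adb shell am start` command for a plain-text share intent. */
final class ShareIntentCommand {
    // String values of Intent.ACTION_SEND and Intent.EXTRA_TEXT in the Android SDK.
    static final String ACTION_SEND = "android.intent.action.SEND";
    static final String EXTRA_TEXT = "android.intent.extra.TEXT";

    /** Wraps a value in single quotes so the device shell treats it as one argument. */
    static String shellQuote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    /** Returns the argument list: action SEND, MIME type text/plain, and the text as a string extra. */
    static List<String> build(String dictatedText) {
        List<String> command = new ArrayList<>();
        command.add("adb");
        command.add("shell");
        command.add("am");
        command.add("start");
        command.add("-a");
        command.add(ACTION_SEND);
        command.add("-t");
        command.add("text/plain");
        command.add("--es");
        command.add(EXTRA_TEXT);
        command.add(shellQuote(dictatedText));
        return command;
    }
}
```

Passing the result to `new ProcessBuilder(ShareIntentCommand.build("buy milk")).start()` would launch it, assuming adb is on the PATH. The quoting step is there because `adb shell` joins its arguments and hands them to the device shell, so text containing spaces would otherwise be split.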

hobbycommandline commented 2 years ago

I only implemented this in my own app, but you're welcome to use my code. Unfortunately for you, I wrote it in Scheme: https://github.com/hobbycommandline/Hobby-Scheme-Command-Line/blob/master/app/src/main/assets/scheme/actions/note.scm#L43 But if you get stuck at all, my action does work, so you can use it as a reference. There is a carve-out in my README that says Dicio is able to use any code from mine even though it's GPL instead of AGPL. finish-SEND is the method that dispatches the proper intent.

startActivity, which it calls, does set a flag and quit the app, which can be seen here: https://github.com/hobbycommandline/Hobby-Scheme-Command-Line/blob/master/app/src/main/java/org/hobby/dispatcher/IntentDispatcher.kt#L50

muonIT commented 2 years ago

In order to have the fallback of listening to the spoken words, a combination of recognized text and audio recording could be sent to whatever app. My main interest would be to compose an email to myself with the text inline and the audio as an attachment, but that's just my todo workflow ;-)

uhyf3 commented 2 years ago

muonIT, there are standard formats for captioned audio / "subtitles" (The subtitles don't have to be in the same file as the audio.), in case that helps. I was hoping for something to compose interactively (as you would with a keyboard) though, so the audio wouldn't be needed unless doing a whole audio file.
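
One widely used example of such a format is SubRip (.srt), where each cue is a sequence number, a time range, and the caption text, kept in a file separate from the audio. The sketch below formats a single cue. It assumes the speech recognizer can supply start and end times in milliseconds for each phrase, which is not something this thread confirms, and the class name is hypothetical.

```java
import java.util.Locale;

/** Hypothetical formatter for a single SubRip (.srt) cue. */
final class SrtCue {
    /** Formats milliseconds as HH:MM:SS,mmm, the timestamp layout used by SubRip. */
    static String timestamp(long millis) {
        long hours = millis / 3_600_000;
        long minutes = (millis / 60_000) % 60;
        long seconds = (millis / 1_000) % 60;
        long ms = millis % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, ms);
    }

    /** Builds one cue: index line, time range line, text line, then a blank separator line. */
    static String format(int index, long startMillis, long endMillis, String text) {
        return index + "\n"
                + timestamp(startMillis) + " --> " + timestamp(endMillis) + "\n"
                + text + "\n\n";
    }
}
```

Concatenating the cues for successive phrases, numbered from 1, yields a file that a player can show alongside the recorded audio.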

Stypox commented 2 years ago

It would be good if Dicio could take dictation, composing and modifying text.

Yeah, that would be a great addition. Thank you everyone for the information you collected. The possibility to share text with (or talk directly to) the notes app or other apps should be considered.

uhyf3 commented 2 years ago

Might I suggest incorporating Jota Text Editor, a lightweight, easy to use, very stable and compatible FOSS text editor, to facilitate sending it text, & cursor control, selection, & editing commands it supports? (Currently, it supports them by gesture/taps & pushbutton.)

The Android keyboard interface may be another option.

My thanks also to everyone.

thebiblelover7 commented 2 years ago

@Stypox Allowing Dicio to be used as voice input in an app such as org.tasks would be great! I think that is related to this feature request.

RokeJulianLockhart commented 2 years ago

As @uhyf3 stated, if you add the ability to provide textual input, please incorporate an editor that already exists so that you do not re-invent the wheel.

Stypox commented 2 years ago

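I think two problems are being discussed here. Both would be welcome contributions, and if nobody does it before I reach that point, I would implement them, too.

  • Use dicio as a Speech-To-Text app. This can be done by exposing Dicio as an STT to the system, so that in theory it can be used by e.g. keyboards.

  • Creating a skill that supports dictation and creating notes. This should possibly be done also in tight coordination with the note taking app the user has as their default on their system. This way we don't have to reinvent the wheel (even copying over code from another app would partially be reinventing the wheel) and users will be happy since they would be able to use the notes app they like the best. I am not sure whether this is doable or not, as maybe there is no common interface for note apps.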

thebiblelover7 commented 2 years ago

While I agree with @Stypox, I think both problems can be solved by point 1. Creating notes would be easiest to implement with STT. I think every notes app would have to find a way to be involved with Dicio for that to work. But again, just my thoughts.

Stypox commented 2 years ago

An update for this: I found out that Athena is able to configure itself as a "Voice input" app. I couldn't find documentation about how that could be done online, but now I can look into Athena to see how they did it.

uhyf3 commented 2 years ago

I think two problems are being discussed here. Both would be welcome contributions, and if nobody does it before I reach that point, I would implement them, too.

  • Use dicio as a Speech-To-Text app. This can be done by exposing Dicio as an STT to the system, so that in theory it can be used by e.g. keyboards.

That would be great!

  • Creating a skill that supports dictation and creating notes. This should possibly be done also in tight coordination with the note taking app the user has as their default on their system. This way we don't have to reinvent the wheel (even copying over code from another app would partially be reinventing the wheel) and users will be happy since they would be able to use the notes app they like the best. I am not sure whether this is doable or not, as maybe there is no common interface for note apps.

Especially if the system STT thing doesn't work out, "copying over code from another app [e.g. Jota Text Editor] would" be a good way to get standardized editing and saving features - notes apps often can't save on the filesystem, and rarely if ever can save in even 1 standard plain text format.

mrjpaxton commented 2 years ago

Another +1 for this.

I haven't seen any open source voice assistant on Android that has actually worked for me besides Dicio, and with offline Vosk no less! I'm really hoping these important features get ported!

But yes, sounds like getting this implemented to support the voice assistant button in keyboards is a good idea, too.

My question is, how would errors in dictation be handled? Example being that the STT system is not perfect, so if you wanted to remove the last word(s), navigate back in the sentence by words, or add special punctuation, how would that be done? I know, sorry... that sounds like a long term goal more than this. But it's unfortunately what most people used to VAs expect.
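
One possible way to handle the corrections asked about above is to treat a few reserved phrases as editing commands instead of text. The following is a minimal sketch of that idea, not how Dicio behaves: the class name and the command phrases ("delete last word", "clear all", "period", "comma") are hypothetical, and a real implementation would also need a way to dictate those words literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical dictation buffer; the spoken command phrases are illustrative only. */
final class DictationBuffer {
    private final List<String> words = new ArrayList<>();

    /** Applies one recognized utterance: either an editing command or text to append. */
    void accept(String utterance) {
        String normalized = utterance.trim().toLowerCase(Locale.ROOT);
        switch (normalized) {
            case "":
                break; // nothing was recognized
            case "delete last word":
                if (!words.isEmpty()) {
                    words.remove(words.size() - 1);
                }
                break;
            case "clear all":
                words.clear();
                break;
            case "period":
                appendPunctuation(".");
                break;
            case "comma":
                appendPunctuation(",");
                break;
            default:
                // Not a command: append every word of the utterance.
                for (String word : utterance.trim().split("\\s+")) {
                    words.add(word);
                }
        }
    }

    /** Attaches a punctuation mark to the last word; does nothing on an empty buffer. */
    private void appendPunctuation(String mark) {
        if (words.isEmpty()) {
            return;
        }
        int last = words.size() - 1;
        words.set(last, words.get(last) + mark);
    }

    /** Returns the composed text with single spaces between words. */
    String text() {
        return String.join(" ", words);
    }
}
```

Feeding it "buy milk and eggs", "delete last word", "bread", and "period" in that order leaves the text "buy milk and bread." in the buffer. Navigating back within a sentence would need a cursor position in addition to the word list, which this sketch leaves out.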

KeronCyst commented 2 years ago

My question is, how would errors in dictation be handled? Example being that the STT system is not perfect, so if you wanted to remove the last word(s), navigate back in the sentence by words, or add special punctuation, how would that be done?

You stop the dictation and manually edit visually on the touchscreen/by keyboard like any other app, right?

Anyways, another strong +1 here. I was shocked that it could listen to all of my words for all of these advanced commands but doesn't even have the most basic feature of transcribing my speech/saving what it captures instead of just deleting it every time. That's actually the only reason I'd use it; I don't care about the other features at all.

Stypox commented 1 year ago

Would you mind testing #109? Does it satisfy your needs?

muonIT commented 1 year ago

@Stypox Just did a quick test. The dictation, called from the settings menu in dicio, works very well and the share feature puts the recognized text right in the email message body - so very close to what I intended! Thank you very much for your efforts! They are much appreciated!! :smiley: :+1:

cannycartographer commented 1 year ago

Just to say I'd love to see something like this - would make Dicio even more useful! Currently I'm finding that this is not fixed by the pull request above. The only place I seem to be able to open Dicio as a navigation drawer and therefore am able to copy the text is in Firefox through the voice input option (and this also places the text straight into the search bar, creating a privacy risk). The notes app I use does not seem to have a voice input option, nor is my keyboard (gboard) recognising Dicio for voice input purposes. And, in line with other issues referenced by others, I don't seem to be able to set Dicio as a general purpose voice input. I wonder if a standalone skill for this might make sense as a workaround for users having issues with other parts of system integration. It could be kept minimal - it could be a 'copy to clipboard' skill and a 'share text to app' skill. Thanks for all your work on the app!

RokeJulianLockhart commented 1 year ago

It definitely doesn't register as a voice input provider.

[Screenshot: Screenshot_20230422-003019]

I simply want to be able to select text and have Dicio read it aloud, as Apple devices can with Siri, and to use Dicio to type into a text box.

uhyf3 commented 1 year ago

I agree. The "composing and modifying" is needed before the text is sent to the other app. Perhaps there are exceptions, but usually, for security & privacy, it would be best to compose/modify before sending. More ways to send the text would be good too. (I particularly like the "copy text to clipboard," "save to plain text file," and 'save audio captions / "subtitles"' possible features.)

forteller commented 1 year ago

I see that this is closed as fixed, but I don't see the feature mentioned anywhere within the app? How do I use this? Thanks!

Stypox commented 1 year ago

@forteller you can press on the "speech to text service" button in the drawer

forteller commented 1 year ago

Ah, I see. Thank you! Shouldn't there be a trigger word for this, though? :)