Stypox / dicio-android

Dicio assistant app for Android
GNU General Public License v3.0
System wide STT service #161

Open nebkrid opened 1 year ago

nebkrid commented 1 year ago

This is still in development, but the pull request is already open for testing and review.

nebkrid commented 1 year ago

Before merging this we will have to make sure users understand the difference between these things

Yes, would the README file be a good place for it? Additionally, a button in the Dicio settings as a shortcut to the Android settings menu for STT would be useful, but I am not sure whether it is in the same place in all Android versions. On Android 10 it seems to be within the assistant settings, but on Android 13 I couldn't find the settings menu at all (or it didn't show up; still an open TODO, see below). The Intent (= drawer = non-silent) way of requesting speech input should be intuitive, as it shows up when Dicio is first installed, like other standard apps.
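As a rough illustration of the settings shortcut idea, the sketch below tries to open the system voice input settings and falls back to the general settings screen. It assumes that `Settings.ACTION_VOICE_INPUT_SETTINGS` leads to (or near) the STT service selection, which may not hold on every Android version or vendor build; `openSttSettings` is a hypothetical helper name and not code from this PR.

```kotlin
import android.content.ActivityNotFoundException
import android.content.Context
import android.content.Intent
import android.provider.Settings

// Hypothetical helper: tries to open the system screen for voice input
// settings, where the default speech recognition service may be selectable.
fun openSttSettings(context: Context) {
    val voiceSettings = Intent(Settings.ACTION_VOICE_INPUT_SETTINGS)
        .addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
    try {
        context.startActivity(voiceSettings)
    } catch (e: ActivityNotFoundException) {
        // Assumption: some devices expose no activity for this action,
        // so fall back to the top-level settings screen.
        context.startActivity(
            Intent(Settings.ACTION_SETTINGS).addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
        )
    }
}
```

Catching `ActivityNotFoundException` rather than querying the package manager first avoids depending on package visibility rules, but where exactly the user ends up still differs between devices.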

... separate gradle module... The app project will then depend directly on vosk-stt-service, and so will still be provided without the need to install two APKs. What do you think about this? Would it be too complicated to do at this point?

I haven't looked into Sapphire yet, so I am not sure whether it is what I guess it is. If it is something like a plugin, so that the apps are related to each other, this generally sounds like a very good approach. However, while looking for Vosk documentation in order to enable more RecognizerIntent extras, I just noticed that there is actually a standalone STT service project by the Vosk developers. At first I doubted whether it is useful at all to spend more time than necessary on this branch. However, since the other project is not easily available (it is in neither F-Droid nor the Play Store, and the latest APK release fails to install) and it seems it will take some time until it is, I think that at least for the moment it is definitely useful to let Dicio export its Vosk implementation, especially with the faster initialization for Dicio in mind. But spending too much effort on "reinventing the wheel" and making a completely standalone app no longer makes sense in my eyes, at least as long as development of the standalone Vosk app is not abandoned (or other features are missing; I haven't tried it yet). What do you think?

nebkrid commented 1 year ago

Current state of this PR: Implemented Features in the STT service:

Extended dicio features

Limitations

lman0 commented 1 year ago

@nebkrid could you provide an APK? I would like to test your PR as a user. Thanks.

cvzi commented 1 year ago

@lman0 I just compiled it to test it, here is an apk: app-debug.zip


  • Reinstallation via Android Studio resets the system setting to the default STT service (I don't know whether this will also happen with normal updates. It may also be device/Android version specific. It seems to happen when the Dicio process is killed by the user or by an update.)

It persists across regular app updates, as long as you don't change the names of the relevant classes.

lman0 commented 1 year ago

@nebkrid here are my test results:

It doesn't show up as a voice IME for other keyboards (tested with OpenBoard and FlorisBoard, both from F-Droid, and with the AOSP keyboard as well). Is that normal? I should mention that I use French. And since I use AOSP Android 12, I don't have Google Play voice recognition. Only Dicio shows up as voice recognition in the settings, but I don't know how to use it.

The microphone remains activated all the time. Unless I'm wrong, STT should not take over the microphone all the time, but only when it is called by apps (like Dicio, a keyboard, ...), right?

nebkrid commented 1 year ago

@cvzi thank you for compiling and for your hints and answers!

@lman0 thank you for testing!

don't show as voice ime

What this PR implements is not explicitly an IME; instead it registers Dicio as a speech recognition provider, which can then be used by all other apps (not just edit text fields), including IMEs. At least FlorisBoard (linkToCode) explicitly searches for an IME that supports voice (and I guess the others do it the same way). Explicit support for IME requests can be discussed too, but I think it would be better (and easier) for a keyboard app to request the system STT service (which can then be set to Vosk/Dicio) than for a speech assistant app to implement all the code and requirements to serve as a keyboard.
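To make the distinction concrete: a speech recognition provider is a subclass of the platform class `android.speech.RecognitionService`, not an `InputMethodService`. The following is only a minimal sketch and not the code of this PR; the class name `SketchRecognitionService` and the placeholder transcript are made up for illustration, and a real service would capture audio and feed it to a recognizer such as Vosk.

```kotlin
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionService
import android.speech.SpeechRecognizer

// Hypothetical provider. In the manifest the <service> entry needs an
// <intent-filter> with the action "android.speech.RecognitionService"
// so that the system can list it as a selectable recognition service.
class SketchRecognitionService : RecognitionService() {

    override fun onStartListening(
        recognizerIntent: Intent,
        listener: RecognitionService.Callback
    ) {
        // Tell the client that audio capture is about to begin.
        listener.readyForSpeech(Bundle())

        // Placeholder: a real implementation would open the microphone here
        // and run the audio through its recognizer before reporting results.
        val results = Bundle().apply {
            putStringArrayList(
                SpeechRecognizer.RESULTS_RECOGNITION,
                arrayListOf("placeholder transcript")
            )
        }
        listener.results(results)
    }

    override fun onStopListening(listener: RecognitionService.Callback) {
        // Stop capturing audio and deliver whatever was recognized so far.
    }

    override fun onCancel(listener: RecognitionService.Callback) {
        // Abort recognition and release the microphone without results.
    }
}
```

Any client that uses `SpeechRecognizer` or a `RecognizerIntent` can end up at such a service once the user selects it as the system default, which is why no keyboard UI is needed on the provider side.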

Unless I'm wrong, STT should not take over the microphone all the time, but only when it is called by apps (like Dicio, a keyboard, ...), right?

Yes, you are right. The microphone should not be occupied all the time. Although it does not happen on my device, I have an idea what it might be. Maybe you can check how this APK behaves?

lman0 commented 1 year ago

I think that, since even the AOSP keyboard searches for a voice IME, all keyboards will follow the same trend and look for an IME for voice recognition. So it would be better to be able to respond to such requests, since that is basically the 'normal' way. Otherwise the STT service will be mostly unusable by keyboards. And besides keyboards, there are, to my knowledge, very few apps that use the STT service directly.

If you know some (except Dicio), I would be interested to know some app names.

About the new APK you provided:

First, I may have used too strong a word with 'takeover'. To be precise, the microphone notification is always shown since Dicio was set as the STT service, but it is still possible to use an audio recorder (the audio recorder takes the microphone just fine and records audio properly).

It's the same: the microphone notification remains shown all the time, until Dicio is removed and the device rebooted.

Is it the same with Google?

lman0 commented 1 year ago

There is the Kõnele app, found on F-Droid, which provides an IME backed by Kõnele-service, which calls an online STT. Maybe you could see how it works.

lman0 commented 1 year ago

@nebkrid if the Kõnele app is installed and Dicio is the only one selected as STT service inside the Kõnele settings, then it seems to work when pressing the microphone button inside other IMEs (the Kõnele IME must be activated in the Android settings).

Interestingly, the microphone notification stops showing when the Kõnele app ends the call to the STT service. (But recognition still works if speech-to-text is done again.)

Otherwise, if I try to open the Dicio settings from inside the Kõnele settings, it says that Dicio blocks the intent. (It's a permission denial when clicking on Dicio inside the recognition service settings of the Kõnele IME app.) Maybe something to improve.

The combination of the Kõnele app with Dicio STT makes the microphone button usable with other IMEs.

lman0 commented 1 year ago

PS: if you search for the Kõnele app inside the F-Droid app, you must type kõnele with the õ, otherwise you will not find it.

nebkrid commented 1 year ago

... So it would be better to be able to respond to such requests, since that is basically the 'normal' way. ... ... The combination of the Kõnele app with Dicio STT makes the microphone button usable with other IMEs. ...

If this PR becomes a standalone app as discussed above, I do agree that this would be better for compatibility. However, for the moment, especially with Kõnele as a working IME with STT (thank you for pointing to this app!), I would prefer to stay focused on the STT. Additionally, I still believe it would be an even more user-friendly option for other keyboard apps to query the STT service directly, as this does not require changing the keyboard UI. That this is not done yet is probably because there are no other easily available speech recognition services besides the Google one, so the STT service part is not well known, or at least offers no benefit at first sight. But it might be an idea to open such a feature request in these apps.

And besides keyboards, there are, to my knowledge, very few apps that use the STT service directly.

Most apps use the intent approach, which is already implemented in Dicio, as it does not need the microphone permission. But when I once searched the Play Store for STT service apps, I saw some dictation apps which in the end used the Google background STT service. And since this function is generally available in Android, I think more apps will come with time. Personally, I am interested in STT for automation (like in #154).
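For readers unfamiliar with the intent approach, here is a hedged sketch of what a calling app typically does. The activity name `DictationSketchActivity` and the log tag are hypothetical; the `RecognizerIntent` constants and the AndroidX Activity Result API are the standard ones. The calling app needs no `RECORD_AUDIO` permission here because the activity that handles the intent does the recording, whereas an app that binds to the background service through `SpeechRecognizer` has to hold that permission itself.

```kotlin
import android.app.Activity
import android.content.ActivityNotFoundException
import android.content.Intent
import android.os.Bundle
import android.speech.RecognizerIntent
import android.util.Log
import androidx.activity.ComponentActivity
import androidx.activity.result.contract.ActivityResultContracts

// Hypothetical client activity that asks whichever app handles
// ACTION_RECOGNIZE_SPEECH (for example Dicio's drawer) for a transcript.
class DictationSketchActivity : ComponentActivity() {

    private val speechLauncher =
        registerForActivityResult(ActivityResultContracts.StartActivityForResult()) { result ->
            if (result.resultCode == Activity.RESULT_OK) {
                // EXTRA_RESULTS holds the candidate transcripts, best match first.
                val candidates =
                    result.data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                Log.d("DictationSketch", "Recognized: ${candidates?.firstOrNull()}")
            }
        }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        val request = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        try {
            speechLauncher.launch(request)
        } catch (e: ActivityNotFoundException) {
            // No installed app offers a speech input activity.
            Log.w("DictationSketch", "No speech recognition activity available", e)
        }
    }
}
```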

First, I may have used too strong a word with 'takeover'. To be precise, the microphone notification is always shown since Dicio was set as the STT service, but it is still possible to use an audio recorder (the audio recorder takes the microphone just fine and records audio properly).

You could still try this app-debug.zip, which shows two types of Toast messages when the microphone should be released, to make sure that the methods are called on your device. But if the toasts are showing, and since it seems only your device has this error (how does the Kõnele app behave as STT with its own recognition service on your device?), I don't know what could be the reason.

lman0 commented 1 year ago

The Kõnele STT service doesn't have this always-on notification.

Even if I kill Dicio, the notification remains.

But if I use the Kõnele app/IME to call the Dicio STT service, then the notification disappears once I do speech recognition and stop Kõnele listening.

lman0 commented 1 year ago

@nebkrid there is a bug with Dicio: if I disable WiFi/4G (i.e. offline), Dicio can't evaluate the STT content that comes from the STT service (instead of the internal Vosk) and shows a 'network error'. But Kõnele with Dicio STT works. Dicio with internal Vosk works with no error, even with all network-related skills disabled.

nebkrid commented 1 year ago

if I disable WiFi/4G (i.e. offline), Dicio can't evaluate the STT content that comes from the STT service (instead of the internal Vosk) and shows a 'network error'

Please double-check which STT service is set as the default in your system settings. Killing the Dicio process resets this to a different one. The network error means that an STT other than the Dicio one is used, since the Dicio STT never returns a network error. (I guess in your setup Kõnele STT via network is requested.)

But if I use the Kõnele app/IME to call the Dicio STT service, then the notification disappears once I do speech recognition and stop Kõnele listening.

Are both toasts, "stop recognizer" and "shutdown", showing up? Please additionally try the following:
1) In the Dicio settings -> input and output -> input method, choose System provided STT service.
2) Force-kill the Dicio process in the device settings screen for Dicio (the one where you can also uninstall the app) so that the microphone notification is gone.
3) Check that Dicio is set as the default STT service in the system settings.
4) Open the Dicio app and check whether the microphone notification error is gone.

lman0 commented 1 year ago

You are right about the bug. Indeed, since I had installed Kõnele before Dicio, when Dicio was stopped the STT default went back to Kõnele.

I uninstalled both Dicio and Kõnele, rebooted, installed Dicio, rebooted, then installed Kõnele, and made sure Dicio was selected for both (STT and IME).

Then I selected Android STT as the STT source inside Dicio (and closed it to make sure it uses Android STT). When offline, Dicio worked correctly.

It was tricky.

@nebkrid it seems that the reason Kõnele is split into STT and IME was so that it is not affected if the app is stopped.

By the way, if I use internal Vosk instead of external STT, the toasts still show that the speech recognizer was stopped and then shut down. Is that normal?

lman0 commented 1 year ago

@nebkrid Both toasts show up when Kõnele uses Dicio STT and stops listening. And it's not a notification 'error': a microphone icon is shown near the battery icon, indicating that an app is using the microphone. It is disturbing because it feels like Dicio listens all the time, even though the STT should not. The icon stops showing when Kõnele made the call and Dicio was stopped. The icon still shows if Dicio is started and then stopped.

I think it may be linked to the fact that Dicio automatically starts listening (uses the microphone) when started, but may not tell the system that it no longer uses the microphone when stopping speech recognition, whether internal or not. I wonder if this would still be the same if the Dicio STT were split into another package (with another package name).

nebkrid commented 1 year ago

By the way, if I use internal Vosk instead of external STT, the toasts still show that the speech recognizer was stopped and then shut down. Is that normal?

The recognizer stop, yes. The shutdown was something I only added for your issue with the microphone that keeps showing; it is not necessary on my device to clear the system microphone symbol. Since both show up, everything regarding the microphone is released. When set to internal Vosk, the process is kept running (with the microphone released but the speech model loaded) in order to improve loading speed, whereas when Dicio is set to Android STT it is not specified whether it will keep running, and at least on my device it is stopped very soon after Dicio is closed. For your tests it is important, when changing from internal Vosk to Android STT, to make sure that the Dicio process is killed in the settings (not just the UI "swiped away"), because otherwise there is technically no difference to internal Vosk if it was started once with it (which always happens when it is freshly installed).

[screenshot]

(This must be gray after step 2 above. Otherwise the Dicio STT process is still running in the background without UI and is reused as soon as the UI is loaded again.)
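The difference between "process still running" and "microphone still held" can be illustrated with the platform `AudioRecord` API. This is only a sketch under assumptions: `MicSession` is a hypothetical class, the 16 kHz mono format is an arbitrary choice, and nothing here claims to be how Dicio or Vosk manage audio internally. The point is that releasing the recorder is separate from keeping a loaded speech model (or the whole process) alive.

```kotlin
import android.annotation.SuppressLint
import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder

// Hypothetical wrapper: holds the microphone only between start() and release().
// A loaded speech model could live in another long-lived object and stay in memory.
class MicSession(private val sampleRate: Int = 16_000) {

    private var recorder: AudioRecord? = null

    @SuppressLint("MissingPermission") // the caller must already hold RECORD_AUDIO
    fun start() {
        if (recorder != null) return
        val bufferSize = AudioRecord.getMinBufferSize(
            sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT
        )
        check(bufferSize > 0) { "Unsupported audio configuration" }
        val newRecorder = AudioRecord(
            MediaRecorder.AudioSource.MIC, sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize
        )
        check(newRecorder.state == AudioRecord.STATE_INITIALIZED) {
            "Microphone could not be initialized"
        }
        newRecorder.startRecording()
        recorder = newRecorder
    }

    fun release() {
        recorder?.let {
            // Stop capturing first, then free the native audio resources.
            if (it.recordingState == AudioRecord.RECORDSTATE_RECORDING) it.stop()
            it.release()
        }
        recorder = null
    }
}
```

If `release()` has run and the indicator still shows, the cause is presumably outside this part of the code path, which matches the difficulty of reproducing the issue on other devices.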

lman0 commented 1 year ago

@nebkrid

When I use the audio recorder, the microphone icon stops showing once I stop the recording. But in Dicio's case, it doesn't stop showing. I think the microphone is not released or stopped once the Dicio app has finished recording the audio needed for speech recognition.

https://stackoverflow.com/questions/14252400/how-to-stop-recording-in-android

By the way, when Dicio is closed within app info, with internal Vosk set before, and I restart Dicio, the toasts still appear when using internal Vosk.

As expected, closing Dicio deselects Dicio in the STT selection, but that is already known.

nebkrid commented 1 year ago

When I use the audio recorder, the microphone icon stops showing once I stop the recording. But in Dicio's case, it doesn't stop showing.

I understand your issue and agree that it is annoying that it still seems to listen. However, if the test with the 4 steps described above does not help, I have no idea left what causes the microphone symbol to keep showing. Technically, with this test there should be no difference left between Kõnele finishing its use of the Dicio STT service and Dicio using it. And the toasts confirm that the methods for releasing and stopping the microphone are called. Therefore, I am very sorry, but I really have no idea what is different on your phone. (I tested it on three devices, and neither cvzi nor Stypox has mentioned this yet.) Maybe a log output would help if it shows any specific errors, but I don't know whether this is possible from a compiled APK (@Stypox is this possible?).

lman0 commented 1 year ago

@nebkrid I understand, it's OK, since it seems I am the only one in this situation, and it's somewhat cosmetic for some.

It's better to have other problems resolved first.

The more problematic issue is the reset of the Dicio selection. This problem doesn't occur if Dicio is not stopped.

For instance, when I start my phone without starting Dicio and I use Kõnele to use Dicio, it works without any problem. And when I checked the app info of Dicio, I discovered that Dicio had been started silently (there is no GUI in the list of apps). Moreover, in my case, the microphone icon is not there even though Dicio has been started silently.

I found that Kõnele itself, like Dicio, also has an internal/offline STT. Maybe checking the Kõnele source code would help, since it does not seem to have Dicio's reset problem.

nebkrid commented 1 year ago

The more problematic issue is the reset of the Dicio selection. This problem doesn't occur if Dicio is not stopped.

But is this actually a real problem in daily use? To me it seems that this is only a problem while developing (because of the reinstalls).

And when I checked the app info of Dicio, I discovered that Dicio had been started silently (there is no GUI in the list of apps).

Yes, this is because you started the background service when using Kõnele with Dicio. It seems that you are technically interested in this whole system and have also done a lot of comparing. Because you once wrote "test as a user", I don't know whether you have programming skills, but even without any I think it would be interesting for you to read a bit about the Android system and how apps interact. Just search for things like "android developer xxx" and you will be pointed to the well-written Android developer resources (e.g. this one; just don't be confused by the referenced classes. Keep on reading, follow the linked classes and read their introductions, and with time it will make sense ;) ).

lman0 commented 1 year ago

But is this actually a real problem in daily use? To me it seems that this is only a problem while developing (because of the reinstalls).

In daily use, in my case, I swipe away recent apps that I don't use, and since the green icon is always on, I close Dicio whenever I can. And when I update Dicio, I don't want to have to check and make Dicio the default again...

I said 'test as a user' because the people who usually respond to such pull requests are developers, so I wanted to make clear that I'm not one.

And I compared with Kõnele because it is the app I use every day, and because I find your pull request interesting.

I tried to describe what I found and what I searched for in their issues, so you could check with them or in their source code how they manage to do it well, when here it still seems problematic.

hexedsilicon commented 1 year ago

What's the hold up on getting this merged?

nebkrid commented 1 year ago

@hexedsilicon sorry for my late reply; I haven't been on GitHub for a while. I am not quite sure. I stopped working on this when it was working from my perspective and with the possibilities (devices) I have for testing. I guess (but don't know) that @Stypox wants separate modules for it, as he suggested in one comment above. I have long wanted to look into this but haven't had the time/motivation yet, due to the device-dependent issues discussed above. Did you try it out, and can you confirm that it also works on devices other than mine? From my perspective it is working, and I am using it regularly. Therefore it would indeed be great to have it in the main branch, since even if this feature does not work perfectly on some devices, it does not change the behaviour of the existing app if it is not activated. (At least it didn't, based on the branch from which I created the PR.)

Stypox commented 2 months ago

Sorry for taking so long to actually take a look at this PR. In the meantime I did a complete refactor, so this PR is not applicable to the current code anymore. However, now the SttInputDevice is much simpler to interact with, so recreating this PR was simple: #227. Thank you everyone for your efforts! I will keep this PR open as a reminder to implement the other things this PR provides, namely the possibility to use the system STT service in Dicio, and the sound played when starting to listen. @nebkrid could you review #227?

nebkrid commented 2 months ago

Great news that you now want to include this feature. I have actually been using two apps in parallel over the last year: one from this PR as an STT service provider, and your current main branch with updated and new features as a second one. Since I am not used to Kotlin (and it was quite a while ago, so I no longer know the detailed implementation requirements), I didn't review your code in detail, but I tested your sample app from #227 with the app in which I regularly use the STT feature (on an Android 13 device), and it is working! Thank you!