Stypox / dicio-android

Dicio assistant app for Android
GNU General Public License v3.0

Speech to text service, also available to other apps #109

Closed Stypox closed 1 year ago

Stypox commented 1 year ago

Speech to text service

This PR implements a Speech To Text service available to other apps, fixing #54. The preview attached to the PR shows the feature after pressing the microphone button in Google Maps.

The service can also be opened from Dicio's navigation drawer, letting the user take dictation, copy the text to the clipboard and share it, fixing #33.

Testing APK

app-debug.zip

Technical details

This PR supersedes #100 by @nebkrid. #100 implemented the service as a skill, while this PR implements it as its own activity. The research done in #100 was really helpful though! I also kept the TODOs left behind there for later: for example, the result intent from the activity could contain multiple speech interpretations, each with its own accuracy, and while Vosk does provide this information, it is currently not added to the result intent for simplicity.

The Speech-To-Text functionality is exported to other apps, which can use it by calling startActivityForResult with an Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).

The RecognizerIntent.EXTRA_PROMPT extra is supported.

This PR includes #111, thanks to @nebkrid again :-)

  • The prompt message shows up as a hint (if none is provided, the default is still "Say something...").
  • Added an auto-finish preference. Reason: Vosk is good, but at least in German it is not perfect. It is therefore easier (and faster, since it avoids waiting for the Vosk model to load again) if the user can confirm the result or speak again before it is reported back to the requesting app.
  • Added the TODO from #100 about optional (and seldom, if ever, used) extras such as EXTRA_BIASING_STRINGS and EXTRA_LANGUAGE, as a future reference and a reminder of which extras may help Vosk improve its recognition results.

This PR also fixes a random crash when cleaning up Vosk, and sets the theme color used in e.g. button texts to a sensible value.

sudomain commented 1 year ago

Is there an example of starting this activity using am? I've tried many variations of the following, but to no avail:

$ am start -a RecognizerIntent.ACTION_RECOGNIZE_SPEECH -e RecognizerIntent.EXTRA_PROMPT test
Starting: Intent { act=RecognizerIntent.ACTION_RECOGNIZE_SPEECH (has extras) }
Error: Activity not started, unable to resolve Intent { act=RecognizerIntent.ACTION_RECOGNIZE_SPEECH flg=0x10000000 (has extras) }

RokeJulianLockhart commented 1 year ago

@sudomain, that's best asked at https://github.com/Stypox/dicio-android/discussions/new?category=q-a

nebkrid commented 1 year ago

@sudomain I have no experience with am, but guessing from Error: Activity not started, unable to resolve Intent { act=RecognizerIntent.ACTION_RECOGNIZE_SPEECH flg=0x10000000 (has extras) }: maybe you have to use the string "android.speech.action.RECOGNIZE_SPEECH" directly (as in the activity's manifest definition)? This is the actual value of RecognizerIntent.ACTION_RECOGNIZE_SPEECH.
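
The suggestion above can be sketched as follows. This is an untested sketch, not a confirmed fix: it has not been run against Dicio. It assumes that am needs the string values of the constants rather than their Java names. The two values used are the documented ones for RecognizerIntent.ACTION_RECOGNIZE_SPEECH and RecognizerIntent.EXTRA_PROMPT; the variable names and the prompt text "test" are only illustrative.

```shell
#!/bin/sh
# am only sees plain strings, so the Java constant names have to be
# replaced by the string values they stand for.
ACTION="android.speech.action.RECOGNIZE_SPEECH"   # RecognizerIntent.ACTION_RECOGNIZE_SPEECH
EXTRA_PROMPT="android.speech.extra.PROMPT"        # RecognizerIntent.EXTRA_PROMPT

# Run only where am exists (inside 'adb shell' or a terminal on the device).
if command -v am >/dev/null 2>&1; then
    # --es adds a string extra: key first, then value.
    am start -a "$ACTION" --es "$EXTRA_PROMPT" "test"
else
    echo "am not found; run this inside 'adb shell' or on the device" >&2
fi
```

Even if the intent resolves this way, am start has no caller activity to receive the result, so the recognized text would not be returned to the shell; the command is mainly useful to check that the activity opens and shows the prompt.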