ankidroid / Anki-Android

[Feature] Web Speech API in Javascript #8794

Closed: mikunimaru closed this issue 3 years ago

mikunimaru commented 3 years ago

As a solution to all the functional requirements related to TTS, I would like to propose support for the Web Speech API in JavaScript.

With Web Speech API support, user-side JavaScript could reproduce the speech behavior of the cloze-only tag, place a speak button on the card, and speak the text multiple times at different speeds.

This keeps the application itself simple, because the more complex TTS behavior is left to the user's JavaScript.
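As an illustration of the "speak the text multiple times at different speeds" idea, here is a minimal sketch. It assumes a runtime where the Web Speech API is available; the function name `speakAtRates` and its `synth` and `Utterance` parameters are illustrative and not part of AnkiDroid or the proposal. Passing the synthesizer and the utterance constructor in as arguments keeps the function usable with a stub when the API is missing.

```javascript
// Sketch only: `speakAtRates`, `synth` and `Utterance` are illustrative names.
// Queues the same text once per rate; the Web Speech API speaks queued
// utterances in the order they were added.
function speakAtRates(text, rates, synth, Utterance) {
  const queued = [];
  for (const rate of rates) {
    const utterance = new Utterance(text);
    utterance.rate = rate; // 1 is the default speaking rate
    synth.speak(utterance);
    queued.push(utterance);
  }
  return queued;
}

// Possible use in a browser, for example from a button click handler:
// speakAtRates("This is TTS test.", [0.7, 1.0],
//              window.speechSynthesis, window.SpeechSynthesisUtterance);
```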

Below is the sample code.

<p><select id="voice"></select></p>
<p><textarea id="textarea">This is TTS test.</textarea></p>
<p><button id="button1">Speak</button>
  <button id="button2">Stop</button></p>

<script> 
// CC0 http://creativecommons.org/publicdomain/zero/1.0/
if (window.speechSynthesis) {
  let voices = [];
  function setVoices() {
    if (voices.length) return;
    voices = speechSynthesis.getVoices();
    if (!voices.length) return;
    voices
      .filter(v => v.lang.startsWith("en"))
      .forEach(v => {
        let opt = document.createElement("option");
        opt.text = v.name;
        opt.voice = v;
        voice.appendChild(opt);
      });
  }
  speechSynthesis.addEventListener("voiceschanged", setVoices);
  setVoices();
}
button1.addEventListener("click", () => {
  let opt = voice.selectedOptions;
  if (!opt.length) return;
  let u = new SpeechSynthesisUtterance(textarea.value);
  u.voice = opt[0].voice;
  u.lang  = u.voice.lang;
  u.addEventListener("boundary", e => {
    if (e.name != "word") return;
    textarea.focus();
    textarea.setSelectionRange(e.charIndex, e.charIndex + e.charLength);
  });
  u.addEventListener("end", () => textarea.setSelectionRange(0, 0));
  speechSynthesis.speak(u);
});
button2.addEventListener("click", () => {
  if (!window.speechSynthesis) return;
  speechSynthesis.cancel();
});
 </script>

I've also prepared a page where you can easily try out the code: https://codepen.io/mikunimaru/pen/poevqKq

The code above runs in the browser on my Android device, so I think Android is capable of supporting the Web Speech API.

welcome[bot] commented 3 years ago

Hello! 👋 Thanks for logging this issue. Please remember we are all volunteers here, so some patience may be required before we can get to the issue. Also remember that the fastest way to get resolution on an issue is to propose a change directly, https://github.com/ankidroid/Anki-Android/wiki/Contributing

krmanik commented 3 years ago

Have you tried this: https://github.com/ankidroid/Anki-Android/wiki/FAQ#to-use-tts-on-ankidesktop-and-ankidroid ? It also works for cloze.

SpeechSynthesisUtterance is not supported in Android WebView, so the code above will not work inside the AnkiDroid reviewer. Maybe a Java implementation will be written to provide TTS. Also, after the Rust conversion, TTS will be improved in a future release of AnkiDroid. https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesisUtterance
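Whether the API is present can be checked at runtime before relying on it. The sketch below is one way to do that; the helper name `hasWebSpeech` is an assumption made for illustration, and the check only tells you that the objects exist, not that any voices are installed.

```javascript
// Sketch only: `hasWebSpeech` is an illustrative helper, not an AnkiDroid API.
// Returns true when both the synthesizer object and the utterance
// constructor exist on the given global scope.
function hasWebSpeech(scope) {
  return Boolean(scope) &&
    typeof scope.speechSynthesis === "object" &&
    scope.speechSynthesis !== null &&
    typeof scope.SpeechSynthesisUtterance === "function";
}

// Possible use inside a card template:
// if (!hasWebSpeech(window)) { /* hide the Speak button or use another TTS path */ }
```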

mikunimaru commented 3 years ago

In the current app specification, there is no way to achieve the behavior of speaking only the opened part of the text when cloze is opened.

Is it possible to use Chrome Custom Tabs instead of WebView to enable rich JavaScript? https://developer.chrome.com/docs/android/custom-tabs/overview/

The availability of powerful JavaScript provides great extensibility to AnkiDroid, which has no add-ons.

krmanik commented 3 years ago

> In the current app specification, there is no way to achieve the behavior of speaking only the opened part of the text when cloze is opened.

Maybe in AnkiDroid 2.15 or 2.16, TTS will be greatly improved, including for cloze.

> Is it possible to use Chrome Custom Tabs instead of WebView to enable rich JavaScript? https://developer.chrome.com/docs/android/custom-tabs/overview/

I don't think so, but you can submit a PR.

> The availability of powerful JavaScript provides great extensibility to AnkiDroid, which has no add-ons.

JS add-on support will be available for AnkiDroid soon.

mikunimaru commented 3 years ago

I've read the comments and understand that the Web Speech API is difficult to support. What about an alternative: providing an API that lets JavaScript in the card template call AnkiDroid's TTS freely?

Such an API would give users a lot of freedom to customize TTS behavior.

mikunimaru commented 3 years ago

I tried adding a TTS API for JS with a simple modification: https://github.com/mikunimaru/Anki-Android/commit/b04f7cd4415923d8ce7b2e939d717607f738fb0d

https://user-images.githubusercontent.com/43168745/117616402-0b777e00-b1a6-11eb-9e10-6126f3b2e937.mp4

<p><textarea id="textarea">This is TTS test.</textarea></p>
<p><button id="button1">Speak</button></p>

<script>
// CC0 http://creativecommons.org/publicdomain/zero/1.0/
// Initialize the AnkiDroid JS API with a version and developer contact.
var jsApi = {"version" : "0.0.1", "developer" : "dev@mail.com"};
var apiStatus = AnkiDroidJS.init(JSON.stringify(jsApi));

// Speak the textarea contents using the experimental method from the commit above.
button1.addEventListener("click", () => {
  const text = document.getElementById("textarea").value;
  AnkiDroidJS.speak_experimental(text);
});
</script>

Calling TTS this way seems to work very well. Please consider adding a JavaScript API. (Adding the API myself is too difficult, because I don't understand the code well enough to estimate the required changes.)
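A template that has to run both inside AnkiDroid and in a regular browser could pick whichever speech path is present. The following is a rough sketch under two assumptions: `speak_experimental` is only the experimental method from the commit linked in this comment (not a released AnkiDroid API), and `speakText` is a hypothetical helper name.

```javascript
// Sketch only: `speakText` is hypothetical, and `speak_experimental` is the
// experimental bridge method from the commit in this thread.
// Returns the name of the path that was used, or "none" if no TTS is available.
function speakText(text, scope) {
  const bridge = scope.AnkiDroidJS;
  if (bridge && typeof bridge.speak_experimental === "function") {
    // Call through the bridge object so the method keeps its receiver.
    bridge.speak_experimental(text);
    return "ankidroid";
  }
  if (scope.speechSynthesis && typeof scope.SpeechSynthesisUtterance === "function") {
    scope.speechSynthesis.speak(new scope.SpeechSynthesisUtterance(text));
    return "web-speech";
  }
  return "none";
}

// Possible use in a card template:
// speakText(document.getElementById("textarea").value, window);
```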

krmanik commented 3 years ago

You should create a PR for this, since you have already implemented it. But first, wait for the AnkiDroid maintainers to give feedback on this feature.

I have also tried code like this, and it works well; you may take some ideas from it. The function goes in AbstractFlashcardViewer.java. The benefit is that a custom language, TTS engine, pitch, and speed can be set or changed from the user's JS code.

@JavascriptInterface
public void ankiTextToSpeech(String text, String locale) {
    // A new engine instance is created on every call, bound to the Google TTS engine.
    ankiJStts = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
        @SuppressLint("NewApi")
        @Override
        public void onInit(int status) {
            if (status != TextToSpeech.SUCCESS) {
                return;
            }
            int result = ankiJStts.setLanguage(new Locale(locale));
            if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                Timber.i("Not supported or missing language data.");
            } else {
                // Queue the text behind anything that is already being spoken.
                ankiJStts.speak(text, TextToSpeech.QUEUE_ADD, null, "0000000");
            }
        }
    }, "com.google.android.tts");
}

Usage:

<script> 
AnkiDroidJS.ankiTextToSpeech("你好!,世界", "zh_CN"); 
</script>

krmanik commented 3 years ago

Also, here is an old Discord message about this feature from @mikehardy:

> That TTS stuff looks very interesting! If it works well, I wonder if we can "bend" the built-in / "core" TTS implemented in AnkiDroid to simply emit the necessary HTML/JS using that.

mikunimaru commented 3 years ago

The video below shows the user-side cloze speech that I described in my first comment.

https://user-images.githubusercontent.com/43168745/117621579-b3904580-b1ac-11eb-95fb-00f9948d3ed2.mp4

<script> 
var jsApi = {"version" : "0.0.1", "developer" : "dev@mail.com"};
var apiStatus = AnkiDroidJS.init(JSON.stringify(jsApi));

var cloze = document.querySelector(".cloze"); 
AnkiDroidJS.speak_experimental(cloze.textContent);
</script>

If a JavaScript TTS API is implemented, AnkiDroid will become even more powerful. I'm a novice when it comes to Java, so I'd rather leave the pull request to someone who is familiar with Java.
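The snippet above reads a single `.cloze` element and would throw if the card has none. A slightly more defensive variant might collect every cloze span first. This is a sketch only: `clozeText` is a hypothetical helper, and the `.cloze` class is taken from the snippet above.

```javascript
// Sketch only: `clozeText` is a hypothetical helper name.
// Collects the text of every element with class "cloze" under `root`
// and joins the non-empty pieces; returns "" when there is nothing to speak.
function clozeText(root) {
  const parts = Array.from(
    root.querySelectorAll(".cloze"),
    (el) => el.textContent.trim()
  );
  return parts.filter(Boolean).join(", ");
}

// Possible use in a card template (speak_experimental is the experimental
// method from this thread):
// const text = clozeText(document);
// if (text) AnkiDroidJS.speak_experimental(text);
```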

mikehardy commented 3 years ago

Sure, propose a PR exposing TTS features via javascript, that's fine by me - just please make them generic so they are generally useful. I think that's a small hurdle though - I believe the only things we really do with TTS are start/stop the engine, start/stop speaking, add text to the speaking queue (?), and set a default TTS language for a deck (?)
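To make the idea of a "generic" surface concrete, here is a purely hypothetical sketch of how the operations listed above (queue text, start and stop speaking, set a language) could look from the JavaScript side. None of these names exist in AnkiDroid; `createTtsFacade` and the `engine` object with `speak` and `stop` methods are assumptions made only for illustration.

```javascript
// Hypothetical shape only: none of these names exist in AnkiDroid.
// `engine` is any object with speak(item) and stop() methods.
function createTtsFacade(engine) {
  let language = null; // language applied to items queued from now on
  const queue = [];
  return {
    setLanguage(tag) { language = tag; },
    // Add text to the speaking queue with the language active at this moment.
    enqueue(text) { queue.push({ text, language }); },
    // Hand queued items to the engine in order, then leave the queue empty.
    flush() { while (queue.length) engine.speak(queue.shift()); },
    // Drop anything still queued and stop the engine.
    stop() { queue.length = 0; engine.stop(); },
    pending() { return queue.length; },
  };
}
```

Keeping the queue and language handling in one small object like this is only one possible design; the point is that a template author would need just a handful of calls.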

github-actions[bot] commented 3 years ago

Hello 👋, this issue has been opened for more than 2 months with no activity on it. If the issue is still here, please keep in mind that we need community support and help to fix it! Just comment something like still searching for solutions and if you found one, please open a pull request! You have 7 days until this gets closed automatically

david-allison commented 3 years ago

@mikunimaru FYI: We're in the process of accepting Anki Desktop style TTS.

https://github.com/ankidroid/Anki-Android/blob/253706a403cae606e6db455055a8b44e3b7ed662/AnkiDroid/src/main/java/com/ichi2/libanki/Sound.kt#L27-L37

I believe that this JS API is compatible with this class, but flagging it up in case we need to make changes before 2.16 goes out.