dlutton / flutter_tts

Flutter Text to Speech package
MIT License

Switching between languages is slow on iOS #418

Open mvrsn opened 1 year ago

mvrsn commented 1 year ago

💬 Questions and Help

In my app I play text in multiple languages. On Android this works as expected, but on iOS initializing the new language after a switch can be really slow, and switching back and forth between languages makes the app feel very unresponsive.

Is there some way to keep the languages initialized so that this doesn't have to happen on every switch?

MrOnyszko commented 11 months ago

Oh, I see I'm not the only one with this issue.

I think the solution might be two instances of a Speaker class in the native iOS code. I am not 100% sure, because TTS on iOS behaves poorly overall. I will probably create my own TTS plugin for my app.

import Foundation
import SwiftUI
import AVFoundation
import os

@main
struct ttsApp: App {
    var body: some Scene {
        WindowGroup {
            TwoLanguagesView(
                germanSpeaker: Speaker(),
                polishSpeaker: Speaker()
            )
        }
    }
}

class Speaker: NSObject, AVSpeechSynthesizerDelegate {
    let synthesizer = AVSpeechSynthesizer()

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func speak(msg: String, voice: AVSpeechSynthesisVoice) {
        let utterance = AVSpeechUtterance(string: msg)
        utterance.voice = voice
        synthesizer.speak(utterance)
    }
}

struct TwoLanguagesView: View {
    let germanSpeaker: Speaker
    let polishSpeaker: Speaker

    let polishText = "Cześć, to jest niesamowity dzień, i świetnie się bawię."
    let germanText = "Hallo, das ist ein großartiger Tag, und ich habe viel Spaß."

      var body: some View {
          VStack(spacing: 20) {
              Text("Polish:")
                  .font(.headline)
              Text(polishText)
                  .font(.body)
                  .padding()

              Button(action: {
                  // Use the dedicated Polish speaker so each language keeps its own synthesizer.
                  polishSpeaker.speak(
                    msg: polishText,
                    voice: AVSpeechSynthesisVoice(
                        identifier: "com.apple.voice.enhanced.pl-PL.Zosia"
                    )!
                  )
              }) {
                  Text("Play Polish")
                      .foregroundColor(.white)
                      .padding()
                      .background(Color.blue)
                      .cornerRadius(10)
              }

              Text("German:")
                  .font(.headline)
              Text(germanText)
                  .font(.body)
                  .padding()

              Button(action: {
                  germanSpeaker.speak(
                    msg: germanText,
                    voice: AVSpeechSynthesisVoice(
                        identifier: "com.apple.voice.enhanced.de-DE.Anna"
                    )!
                  )
              }) {
                  Text("Play German")
                      .foregroundColor(.white)
                      .padding()
                      .background(Color.blue)
                      .cornerRadius(10)
              }
          }
          .padding()
      }
}

MrOnyszko commented 11 months ago

I can confirm that having multiple instances of an AVSpeechSynthesizer wrapper class like Speaker solves the issue. I hold the speaker instances in a map and it works well. Initializing a second language still takes some time, but I think it's possible to speak an empty string as early as possible, so that the user won't notice the delay later.
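
For anyone who wants to try this, here is a minimal sketch of the map-plus-warm-up idea described above. It is not code from flutter_tts: the `SpeakerPool` class and its method names are made up for illustration, and whether speaking an empty string really hides the first-use delay is an assumption taken from the comment, not something measured here.

```swift
import AVFoundation

/// Hypothetical helper (not part of flutter_tts): keeps one
/// AVSpeechSynthesizer per language code so that switching languages
/// reuses an already created synthesizer.
final class SpeakerPool {
    private var synthesizers: [String: AVSpeechSynthesizer] = [:]

    /// Returns the synthesizer cached for `language`, creating it on first use.
    func synthesizer(for language: String) -> AVSpeechSynthesizer {
        if let existing = synthesizers[language] {
            return existing
        }
        let created = AVSpeechSynthesizer()
        synthesizers[language] = created
        return created
    }

    /// Speaks a silent, empty utterance so the voice is touched early.
    /// Assumption: this moves the slow initialization to app start.
    func warmUp(language: String) {
        // Skip languages that have no installed voice instead of crashing.
        guard let voice = AVSpeechSynthesisVoice(language: language) else { return }
        let utterance = AVSpeechUtterance(string: "")
        utterance.voice = voice
        utterance.volume = 0
        synthesizer(for: language).speak(utterance)
    }

    /// Speaks `text` with the default voice for `language`.
    func speak(_ text: String, language: String) {
        let utterance = AVSpeechUtterance(string: text)
        // If no voice exists for the code, this stays nil and the system default is used.
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        synthesizer(for: language).speak(utterance)
    }
}
```

Usage would be to create one pool at startup, call `pool.warmUp(language:)` for each language the app needs (for example "de-DE" and "pl-PL"), and later call `pool.speak(_:language:)`. Looking voices up by language code instead of force-unwrapping an enhanced voice identifier also avoids a crash on devices where that voice is not installed.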