csdcorp / speech_to_text

A Flutter plugin that exposes device-specific speech-to-text recognition capability.
BSD 3-Clause "New" or "Revised" License

Available locale throws an error when playing it (iOS) #428

Open cgutierr-zgz opened 1 year ago

cgutierr-zgz commented 1 year ago

When I try to recognize this language: ca-ES I get the following error:

Required assets are not available for Locale:ca-ES

The thing is that, if I load the locales, I get it as part of the list:

LocaleName: German (Germany) - de-DE
LocaleName: German (Austria) - de-AT
LocaleName: German (Switzerland) - de-CH
LocaleName: Cantonese (Mainland China) - yue-CN
LocaleName: Catalan (Spain) - ca-ES <-
LocaleName: Czech (Czechia) - cs-CZ
...

The system locale is: LocaleName: Spanish (Mexico) - es-MX (the phone was set to it for testing)


It's weird because sometimes it works the first time and then the second time I get this error, but usually it just doesn't work at all.
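For reference, a minimal Swift sketch (outside the plugin) of the kind of check iOS itself offers for a locale; the locale identifier and print labels are just illustrative:

```swift
import Speech

// Ask iOS directly whether a recognizer for ca-ES can be created and is usable.
let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ca-ES"))
print("recognizer created:", recognizer != nil)
print("isAvailable:", recognizer?.isAvailable ?? false)
if #available(iOS 13.0, *) {
    // Offline ("on-device") recognition also needs the locale's speech assets installed.
    print("supportsOnDeviceRecognition:", recognizer?.supportsOnDeviceRecognition ?? false)
}
```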

sowens-csd commented 1 year ago

iOS or Android? So you have successfully used voice recognition for this language? If it never worked I'd suggest you make sure the right speech packs are downloaded to the device. But if it works properly sometimes then it sounds like an issue with the platform speech recognition services.

cgutierr-zgz commented 1 year ago

It's iOS, and the speech packs are downloaded. It's pretty weird because I have implemented my own native solution and it recognizes the language fine, but with the plugin I get this error; it usually works only on the first try (sometimes).

sowens-csd commented 1 year ago

Well that's an interesting result. With another language it works consistently but with this language it does work sometimes but not consistently?

cgutierr-zgz commented 1 year ago


For me it happens with this one; for a coworker it happens with German.

We have all the speech packs downloaded, and when we call .locales() the languages we use are actually part of the list, which I guess is weird.

sowens-csd commented 1 year ago

Right, but if you use a different language, it works consistently? And can you use a language that is not the system default language and have it work consistently? In other words is it just that one other language that is causing the issue or is trying to use any language that is not the default a problem?

cgutierr-zgz commented 1 year ago


Oh sorry, yes, I've used many other languages consistently, such as Arabic, Spanish, English, and German.

sowens-csd commented 1 year ago

The error is always the same?

Required assets are not available for Locale:ca-ES

But in your own native solution you can get that language to work properly on that device and consistently?

sowens-csd commented 1 year ago

This is somewhat interesting and possibly related: https://developer.apple.com/forums/thread/703770

Are you using on-device recognition explicitly?
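For context, a minimal sketch of how explicit on-device recognition is requested in the iOS Speech framework; whether this flag is involved here is only an assumption tied to the question above:

```swift
import Speech

// requiresOnDeviceRecognition (iOS 13+) forces recognition to run entirely on the
// device, which requires the locale's offline speech assets to be installed.
// Assumption: a missing asset pack for the requested locale is one way to hit
// errors like "Required assets are not available for Locale:...".
let request = SFSpeechAudioBufferRecognitionRequest()
if #available(iOS 13.0, *) {
    request.requiresOnDeviceRecognition = true // explicit on-device recognition
}
```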

sowens-csd commented 1 year ago

Also, if you wanted to point me at the code for your native solution I could compare it to the plugin implementation to try to spot the difference.

cgutierr-zgz commented 1 year ago

Hi @sowens-csd, sorry I was away for the rest of the day. Yes, it works without a problem. Let me copy the code snippet:

```swift
import Foundation
import AVFoundation
import Speech

enum VoiceRecognitionNotAvailable: Error {
    case runtimeError(String)
}

@available(iOS 10.0, *)
public class VoiceRecognitionPlugin: NSObject, FlutterPlugin {
    let avAudioSession: AVAudioSession = AVAudioSession.sharedInstance()
    let voiceRecognition: VoiceRecognition

    init(channel: FlutterMethodChannel) {
        self.voiceRecognition = VoiceRecognition(channel: channel)
        super.init()
    }

    public static func register(with registrar: FlutterPluginRegistrar) {
        let channel = FlutterMethodChannel(name: "recognition/voice", binaryMessenger: registrar.messenger())
        registrar.addMethodCallDelegate(VoiceRecognitionPlugin(channel: channel), channel: channel)
    }

    public func handle(_ call: FlutterMethodCall, result: @escaping FlutterResult) {
        switch call.method {
        case "start_recognition":
            if let args = call.arguments as? Dictionary<String, Any>,
               let language = args["language"] as? String,
               let idleTime = args["idleTime"] as? Double,
               let soundEffectsEnabled = args["soundEffectsEnabled"] as? Bool,
               let recordingThresholdTime = args["recordingThresholdTime"] as? Double,
               let autoStopRecognition = args["autoStopRecognition"] as? Bool {
                do {
                    try voiceRecognition.startRecognition(language: language, idleTime: idleTime, autoStopRecognition: autoStopRecognition, recordingThresholdTime: recordingThresholdTime, soundEffectsEnabled: soundEffectsEnabled)
                    result(nil)
                } catch VoiceRecognitionNotAvailable.runtimeError(let errorMessage) {
                    result(FlutterError.init(code: errorMessage, message: nil, details: nil));
                } catch {
                    result(FlutterError.init(code: "Unhandled exception", message: nil, details: nil))
                }
            } else {
                result(FlutterError.init(code: "bad args", message: nil, details: nil))
            }
            break;
        case "stop_recognition":
            if let args = call.arguments as? Dictionary<String, Any>,
               let soundEffectsEnabled = args["soundEffectsEnabled"] as? Bool {
                voiceRecognition.stopRecognition(soundEffectsEnabled: soundEffectsEnabled)
                result(nil)
            }
            break;
        case "get_device_languages":
            result(voiceRecognition.getDeviceLanguages())
            break;
        case "abort_listening_services":
            voiceRecognition.abortListeningServices()
            result(nil)
            break;
        default:
            result(FlutterMethodNotImplemented);
        }
    }
}

@available(iOS 10.0, *)
public class VoiceRecognition: NSObject {
    let voiceRecognitionChannel: FlutterMethodChannel
    let audioEngine = AVAudioEngine()
    var request: SFSpeechAudioBufferRecognitionRequest?
    var recognitionTask: SFSpeechRecognitionTask?
    var inputNode: AVAudioInputNode?
    var detectionTimer: Timer?
    var player: AVAudioPlayer?

    init(channel: FlutterMethodChannel) {
        voiceRecognitionChannel = channel
        request = nil
    }

    public func startRecognition(language: String, idleTime: Double, autoStopRecognition: Bool, recordingThresholdTime: Double, soundEffectsEnabled: Bool) throws {
        let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: language))
        if #available(iOS 15.0, *) {
            if !(speechRecognizer?.isAvailable ?? false) {
                throw VoiceRecognitionNotAvailable.runtimeError("SFSpeechRecognizer not available")
            }
        }
        self.configureAudioSession()
        var didAutoStopRecognition: Bool = false;
        if soundEffectsEnabled {
            self.playAudio(audioName: "VoiceRecognitionActive.wav")
        }
        request = SFSpeechAudioBufferRecognitionRequest()
        request!.shouldReportPartialResults = true
        inputNode = audioEngine.inputNode
        let recordingFormat = inputNode?.inputFormat(forBus: 0)
        inputNode?.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, _) in
            self.request?.append(buffer)
        }
        audioEngine.prepare()
        try! audioEngine.start()
        if !didAutoStopRecognition {
            let intervalStopTime: Double = autoStopRecognition ? idleTime : recordingThresholdTime;
            self.detectionTimer?.invalidate()
            self.detectionTimer = Timer.scheduledTimer(withTimeInterval: intervalStopTime, repeats: false, block: { (timer) in
                didAutoStopRecognition = true;
                self.stopRecognition(soundEffectsEnabled: soundEffectsEnabled)
            })
        }
        recognitionTask = speechRecognizer?.recognitionTask(with: request!) { (result, _) in
            if let transcription = result?.bestTranscription {
                self.voiceRecognitionChannel.invokeMethod("recognized_voice", arguments: transcription.formattedString)
            }
            if autoStopRecognition && !didAutoStopRecognition {
                self.detectionTimer?.invalidate()
                self.detectionTimer = Timer.scheduledTimer(withTimeInterval: idleTime, repeats: false, block: { (timer) in
                    didAutoStopRecognition = true;
                    self.stopRecognition(soundEffectsEnabled: soundEffectsEnabled)
                })
            }
        }
    }

    private func playAudio(audioName: String) {
        let audioPath = Bundle.main.path(forResource: audioName, ofType: nil)!
        self.player = try! AVAudioPlayer(contentsOf: URL(fileURLWithPath: audioPath))
        self.player?.play()
    }

    private func configureAudioSession() {
        do {
            try AVAudioSession.sharedInstance().setCategory(AVAudioSession.Category.playAndRecord, options: [.mixWithOthers, .defaultToSpeaker])
            try AVAudioSession.sharedInstance().setActive(true)
        } catch {}
    }

    public func stopRecognition(soundEffectsEnabled: Bool) {
        if soundEffectsEnabled && audioEngine.isRunning {
            self.playAudio(audioName: "VoiceRecognitionDeactive.wav")
        }
        self.abortListeningServices()
        self.voiceRecognitionChannel.invokeMethod("stopped_recognition", arguments: nil)
    }

    public func abortListeningServices() {
        audioEngine.stop()
        inputNode?.removeTap(onBus: 0)
        request?.endAudio()
        recognitionTask?.cancel()
        request = nil
    }

    public func getDeviceLanguages() -> [[String: String?]] {
        let languages = SFSpeechRecognizer.supportedLocales()
        var availableLanguages: [[String: String?]] = []
        for language in languages {
            availableLanguages.append([
                "name": getLocaleName(language.identifier) ?? language.localizedString(forIdentifier: language.identifier),
                "iOSLanguageCode": language.languageCode,
                "iOSCode": language.identifier,
                "regionCode": language.regionCode
            ])
        }
        return availableLanguages
    }

    private func getLocaleName(_ identifier: String) -> String? {
        let locale = NSLocale(localeIdentifier: "en_US")
        return locale.displayName(forKey: NSLocale.Key.identifier, value: identifier)
    }
}
```