Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Newer keyword models won't load in iOS sdk #2564

Open Lirry18 opened 1 month ago

Lirry18 commented 1 month ago

Error during Wakeword Listen: 0x5 (SPXERR_INVALID_ARG)
Exception with an error code: 0x5 (SPXERR_INVALID_ARG)
[CALL STACK BEGIN]

3 MicrosoftCognitiveServicesSpeech 0x00000001036e28a0 GetModuleObject + 2417632
4 MicrosoftCognitiveServicesSpeech 0x00000001036e41a8 GetModuleObject + 2424040
5 MicrosoftCognitiveServicesSpeech 0x00000001034c9564 GetModuleObject + 217252
6 MicrosoftCognitiveServicesSpeech 0x00000001034b52ac GetModuleObject + 134636
7 MicrosoftCognitiveServicesSpeech 0x00000001034b4bec GetModuleObject + 132908
8 MicrosoftCognitiveServicesSpeech 0x00000001034b4700 GetModuleObject + 131648
9 MicrosoftCognitiveServicesSpeech 0x00000001034a30a4 GetModuleObject + 60388
10 MicrosoftCognitiveServicesSpeech 0x00000001034a1018 GetModuleObject + 52056
11 MicrosoftCognitiveServicesSpeech 0x00000001034a06b0 GetModuleObject + 49648
12 MicrosoftCognitiveServicesSpeech 0x00000001034c28c0 GetModuleObject + 189440
13 MicrosoftCognitiveServicesSpeech 0x00000001034bc4b4 GetModuleObject + 163828
14 MicrosoftCognitiveServicesSpeech 0x00000001035e6a04 GetModuleObject + 1385796
15 MicrosoftCognitiveServicesSpeech 0x00000001035e6990 GetModuleObject + 1385680
16 MicrosoftCognitiveServicesSpeech 0x00000001035e8830 GetModuleObject + 1393520
17 MicrosoftCognitiveServicesSpeech 0x00000001035e6b68 GetModuleObject + 1386152
18 MicrosoftCognitiveServicesSpeech 0x00000001035e99a4 GetModuleObject + 1397988
19 libsystem_pthread.dylib 0x00000001ee2ec06c _pthread_start + 136
[CALL STACK END]

func loadWakeword() throws -> Void {
    do {
        // Strip the extension from the configured model file name.
        let str = WAKEWORD_MODEL.components(separatedBy: ".")[0]
        let model = URL(fileURLWithPath: Bundle.main.path(forResource: str, ofType: "table")!)
        wakewordModel = try SPXKeywordRecognitionModel(fromFile: model.path)
        _reloadWakewordRecognizer()
    } catch let error {
        throw RuntimeError("Error during Wakeword Load: " + error.localizedDescription)
    }
}

private func wakewordCancelled(recog: SPXKeywordRecognizer, eventArg: SPXKeywordRecognitionCanceledEventArgs) {
    switch eventArg.reason {
    case SPXCancellationReason.error:
        print("Error during Wakeword Listen: " + eventArg.errorDetails!)
        // THIS WAS TRIGGERED IN THE ERROR
        try! startWakeword(flutterResult: { _ in return })
    default:
        // Other cancellation reasons are not shown in the original post.
        break
    }
}

Describe the bug

When I load a keyword model created in Speech Studio into my iOS application, the keyword does not work. We have multiple keywords, and all the others work. The last working one was made 2 months ago, and it still works fine, even when I re-download it and place it in my Xcode project. I also tried another model I made 2 days ago, and that one does not work either. That is why I suspect the problem is on the Speech Studio side.

I tried with the SDK we are currently using (1.35) and the updated one (1.38). It seems that 1.40 is not available for iOS yet.

Note: on Android, the keyword worked out of the box, even in SDK 1.30, so I do not think it is the model itself.

To Reproduce

Steps to reproduce the behavior:

  1. Create a keyword in Speech Studio using the Advanced model.
  2. Try to load it in an iOS app, either with the code provided above or with sample code.

Version of the Cognitive Services Speech SDK

Platform, Operating System, and Programming Language

BrianMouncer commented 1 month ago

There was a trunk push failure I did not notice earlier. I've republished 1.40.0, so you should be able to try it now. That is unrelated to the above issue, but thanks for pointing out this missing package release.

pankopon commented 1 month ago

@Lirry18 The error is probably due to a format change, at least for advanced keyword models. Support for the new format was originally included in SDK releases but was withdrawn due to complaints about the binary size growth. Now it's only available in the "embedded speech" package, which contains non-default functionality. However, the documentation does not seem to have been updated to reflect this.

Please download the package from https://aka.ms/csspeech/iosbinaryembedded that currently points to MicrosoftCognitiveServicesSpeech-EmbeddedXCFramework-1.40.0.zip and build your application using that instead of the default package.
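
When retrying with the embedded package, it may help to first rule out a missing bundle resource, since the loader posted above force-unwraps the model path and would crash rather than report a lookup failure. The following is a minimal sketch using Foundation only. `ModelLookupError`, `splitModelFileName`, and `keywordModelPath` are hypothetical names introduced for illustration and are not part of the Speech SDK.

```swift
import Foundation

// Hypothetical error type for this sketch; not part of the Speech SDK.
enum ModelLookupError: Error {
    case notInBundle(String)
}

// Splits a model file name at its last dot, e.g.
// "my.keyword.table" becomes (name: "my.keyword", ext: "table").
// Assumes "table" as the extension when the name has none.
func splitModelFileName(_ fileName: String) -> (name: String, ext: String) {
    let url = URL(fileURLWithPath: fileName)
    let ext = url.pathExtension.isEmpty ? "table" : url.pathExtension
    return (url.deletingPathExtension().lastPathComponent, ext)
}

// Returns the path of the bundled model file, throwing when the resource
// is missing instead of crashing on a force unwrap.
func keywordModelPath(for fileName: String, in bundle: Bundle = .main) throws -> String {
    let parts = splitModelFileName(fileName)
    guard let path = bundle.path(forResource: parts.name, ofType: parts.ext) else {
        throw ModelLookupError.notInBundle(fileName)
    }
    return path
}
```

The returned path would then be passed to `SPXKeywordRecognitionModel(fromFile:)` as in the original post. If the lookup succeeds and recognition is still cancelled with SPXERR_INVALID_ARG, the file is present and the failure is more likely the model format issue described above.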

Lirry18 commented 1 month ago

@pankopon, thanks, I will try that out then! Will this be in the official SDK 1.40 as well?

pankopon commented 1 month ago

@Lirry18 MicrosoftCognitiveServicesSpeech-EmbeddedXCFramework-1.40.0.zip is an official 1.40.0 release package. It's just an alternative to the default MicrosoftCognitiveServicesSpeech-XCFramework-1.40.0.zip (from https://aka.ms/csspeech/iosbinary), offered separately because of the mentioned binary size difference for keyword support and because most customers don't use embedded speech.

github-actions[bot] commented 1 week ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.