Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

[iOS] How to fix an issue where my 3D Blendshapes do not align with the audio. #2481

Open AmAdevs opened 2 months ago

AmAdevs commented 2 months ago

In addVisemeReceivedEventHandler, I receive event.animation. I want to use Viseme 3D Blend Shapes to drive my 3D Avatar.

Here is an example of the JSON payload: `{ "FrameIndex": 0, "BlendShapes": [ [0.021, 0.321, ..., 0.258], [0.045, 0.234, ..., 0.288], ... ] }`. However, in the first round I couldn't parse the JSON; my catch block printed the warning `Error parsing JSON: \(error)`. In the second round I could parse the JSON and set the weights on my 3D model. It works, but the animation doesn't align with the audio.

Can someone help me fix this issue?

Thank you in advance.

This is my code:

func synthesisToSpeaker() {
        guard let subscriptionKey = sub, let region = region else {
            print("Speech key and region are not set.")
            return
        }
        
        var speechConfig: SPXSpeechConfiguration?
        do {
            try speechConfig = SPXSpeechConfiguration(subscription: subscriptionKey, region: region)
        } catch {
            print("Error creating speech configuration: \(error)")
            return
        }
        
        speechConfig?.speechSynthesisVoiceName = "en-US-AvaMultilingualNeural"
        speechConfig?.setSpeechSynthesisOutputFormat(.raw16Khz16BitMonoPcm)
        
        guard let synthesizer = try? SPXSpeechSynthesizer(speechConfig!) else {
            print("Error creating speech synthesizer.")
            return
        }
        
        let ssml = """
            <speak version='1.0' xml:lang='en-US' xmlns='http://www.w3.org/2001/10/synthesis'
                   xmlns:mstts='http://www.w3.org/2001/mstts'>
              <voice name='en-US-CoraNeural'>
                <mstts:viseme type='FacialExpression'/>
                Hello World, May I help you?
              </voice>
            </speak>
            """
        
        // Subscribe to viseme received event
        synthesizer.addVisemeReceivedEventHandler { (synthesizer, event) in
            self.mapBlendshapesToModel(jsonString: event.animation,
                                       node: self.contentNode)
           //print("\(event.animation)")
        }
        
        do {
            let result = try synthesizer.speakSsml(ssml)
            
            switch result.reason {
            case .synthesizingAudioCompleted:
                print("Synthesis completed")
            case .canceled:
                print("Synthesis canceled: \(result.description)")
            default:
                print("Synthesis failed: \(result.description)")
            }
        } catch {
            debugPrint("speakSsml failed: \(error)")
        }
    }
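
In case it helps with debugging the misalignment: the FacialExpression blend-shape frames are emitted at 60 FPS, and viseme events generally arrive ahead of audio playback, so applying weights directly in the event handler can run the face ahead of the sound. Below is a minimal sketch (plain Swift, no SDK types; `VisemeFrameBuffer`, the buffering approach, and the 100-ns interpretation of `event.audioOffset` are assumptions to illustrate the idea) of buffering frames with timestamps and pulling them back out against the audio player's clock:

```swift
import Foundation

// Sketch: buffer blend-shape frames with absolute timestamps instead of
// applying them as soon as the viseme event fires. Assumes frames arrive
// at 60 FPS and that the event's audioOffset is in 100-nanosecond ticks.
struct TimedFrame {
    let time: TimeInterval  // seconds from the start of the audio
    let weights: [Double]   // one weight per blend shape
}

final class VisemeFrameBuffer {
    private(set) var frames: [TimedFrame] = []
    private let frameDuration: TimeInterval = 1.0 / 60.0

    // Called from the viseme event handler with the parsed JSON fields.
    func append(audioOffsetTicks: UInt64, blendShapes: [[Double]]) {
        let chunkStart = TimeInterval(audioOffsetTicks) / 10_000_000.0 // ticks -> s
        for (i, weights) in blendShapes.enumerated() {
            frames.append(TimedFrame(time: chunkStart + Double(i) * frameDuration,
                                     weights: weights))
        }
    }

    // Called from a display link / timer using the audio player's clock;
    // returns the most recent frame due at playbackTime and drops older ones.
    func nextFrame(at playbackTime: TimeInterval) -> TimedFrame? {
        guard let idx = frames.lastIndex(where: { $0.time <= playbackTime }) else {
            return nil
        }
        let frame = frames[idx]
        frames.removeFirst(idx + 1)
        return frame
    }
}
```

The handler would then only call `append`, and a `CADisplayLink` (or similar) driven by the audio clock would call `nextFrame(at:)` and apply the returned weights to the SceneKit node.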

func mapBlendshapesToModel(jsonString: String, node: SCNNode?) {
        guard let jsonData = jsonString.data(using: .utf8) else {
            print("Invalid JSON Data")
            return
        }
        
        guard let node = node else {
            print("Node is nil")
            return
        }
        
        do {
            let json = try JSONSerialization.jsonObject(with: jsonData, options: [])
            if let dictionary = json as? [String: Any] {
                if let frameIndex = dictionary["FrameIndex"] as? Int,
                   let blendShapes = dictionary["BlendShapes"] as? [[Double]] {
                    // Set the blend-shape weights for frameIndex on my 3D model here
                }
            }
        } catch {
            print("Error parsing JSON: \(error)")
        }
    }
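
On the parsing side, decoding with `Codable` and guarding against an empty payload may make the first-round failure easier to diagnose than the opaque "Error parsing JSON" message. A sketch (the `VisemeAnimation` type and `decodeAnimation` helper are hypothetical; field names match the JSON shown above):

```swift
import Foundation

// Sketch: decode the FacialExpression payload with Codable instead of
// JSONSerialization; property names match the service's JSON keys.
struct VisemeAnimation: Decodable {
    let FrameIndex: Int
    let BlendShapes: [[Double]]
}

func decodeAnimation(_ jsonString: String) -> VisemeAnimation? {
    // Guard against an empty or whitespace-only payload, which would
    // otherwise surface only as a generic JSON parsing error.
    let trimmed = jsonString.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty, let data = trimmed.data(using: .utf8) else {
        print("Viseme event carried no animation payload")
        return nil
    }
    do {
        return try JSONDecoder().decode(VisemeAnimation.self, from: data)
    } catch {
        print("Error parsing animation JSON: \(error)")
        return nil
    }
}
```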
github-actions[bot] commented 1 month ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.