Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Speech recognition omits punctuation before conjunctions in a long sentence. #2500

Closed pengcheng933 closed 1 month ago

pengcheng933 commented 2 months ago
    SPXSpeechConfiguration *speechConfig = nil;
    @try {
        NSError *startError2 = nil;
        speechConfig = [[SPXSpeechConfiguration alloc] initWithAuthorizationToken:token region:service error:&startError2];
        if(startError2 != nil){
            NSLog(@"err is %@", startError2);
        }
        [speechConfig setSpeechRecognitionLanguage:lang];
    } @catch (NSException *exception) {
        NSLog(@"Error: %@", exception.reason);
    }

    if (!speechConfig) {
        NSLog(@"Could not load speech config");
        return;
    }
    // Passing nil selects the default microphone.
    SPXAudioConfiguration *audioConfig = [[SPXAudioConfiguration alloc] initWithMicrophone:nil];
    recognizer = [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig audioConfiguration:audioConfig];

    [recognizer addRecognizedEventHandler:^(SPXSpeechRecognizer *recognizer, SPXSpeechRecognitionEventArgs *eventArgs) {
        NSString *recognizedText = eventArgs.result.text;
        NSLog(@"Final recognized text: %@", recognizedText);
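        // Use appendString: so that '%' characters in recognized text are not treated as format specifiers.
        [speechToTextResult appendString:recognizedText];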
        self->isRecognize = false;
        [self endIdentification];
    }];
    
    [recognizer addRecognizingEventHandler:^(SPXSpeechRecognizer *recognizer, SPXSpeechRecognitionEventArgs *eventArgs) {
        self->isRecognize = true;
        NSString *intermediateText = eventArgs.result.text;
        NSLog(@"Intermediate recognized text: %@", intermediateText);
    }];

    [recognizer addCanceledEventHandler:^(SPXSpeechRecognizer *recognizer, SPXSpeechRecognitionCanceledEventArgs *eventArgs) {
        NSLog(@"Recognition canceled. Reason: %ld", (long)eventArgs.reason);
        if (eventArgs.errorDetails != nil) {
            NSLog(@"Error details: %@", eventArgs.errorDetails);
            self->isError = true;
            self->isRecognize = false;
        }
    }];
}

Describe the bug

I pasted a sentence into DeepL, had it read aloud, and used the SDK to convert that speech to text. The result is missing some punctuation: specifically, the commas that join the clauses are dropped. I checked the API documentation and found nothing about this. Can the SDK insert these punctuation marks correctly?

To Reproduce

Steps to reproduce the behavior:

  1. Paste the following text into DeepL and play it back: “Indeed, my MBTI personality is INFJ, so I prefer to stay at home, and I am not very energetic, but I will let myself go out every day, which will make my mood much better.”
  2. The recognized result is: “Indeed, my MBTI personality is INFJ so I prefer to stay at home and I am not very energetic but I will let myself go out every day which will make my mood much better.” The commas before "so", "and", "but", and "which" are missing, while the comma after "Indeed" is kept, so the punctuation is lost only in part.

Version of the Cognitive Services Speech SDK

1.38.0

Platform, Operating System, and Programming Language

pankopon commented 1 month ago

Hi, just to be clear, are you expecting to get a result

pengcheng933 commented 1 month ago

I hope the punctuation can match the DeepL input exactly. If that is too difficult, I would at least like the text to be segmented at speech pauses, because the missing punctuation makes the recognized text harder to understand. I also have one observation and one question:

1. When I switched from DeepL's voice playback to speaking myself, I found that pausing long enough between phrases caused punctuation to be added at those points.
2. Would Microsoft's custom speech feature solve my problem? (Customized Speech Recognition)

pankopon commented 1 month ago

Custom speech may be worth trying in your case; there is information available about custom text formatting for the output. Otherwise, the only sure way to have punctuation exactly where you want it is to enable dictation and dictate the punctuation ("comma", "period", etc.).
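
A minimal sketch of the dictation approach, assuming the `speechConfig`, `audioConfig`, and `recognizer` variables from the original post, and assuming that `enableDictation` and `startContinuousRecognition` are available in the SDK version in use. It is a fragment to place inside the existing setup method, not a complete program:

```objc
// Sketch only: `speechConfig`, `audioConfig`, and `recognizer` are assumed to be
// set up as in the original post.

// Enable dictation mode on the configuration before creating the recognizer,
// so that spoken words such as "comma" or "period" are rendered as punctuation.
[speechConfig enableDictation];

recognizer = [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig
                                                   audioConfiguration:audioConfig];

// Dictation mode is meant for continuous recognition rather than single-shot
// recognition, so start the recognizer in continuous mode.
[recognizer startContinuousRecognition];
```

With this in place the speaker still has to say the punctuation aloud, so it does not help with audio played back from DeepL unless that audio also contains the spoken punctuation words.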

pengcheng933 commented 1 month ago

Thanks for the reply, will try later!

github-actions[bot] commented 1 month ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.

pankopon commented 1 month ago

Closed as answered.