khanhuitse05 / speech-and-text-unity-ios-android

Speech to text in Unity iOS using native Speech Recognition

MIT License

288 stars, 124 forks

Implement Partial Results for iOS #29

Closed: Korbain closed this issue 2 years ago

Korbain commented 3 years ago

Hi,

Thanks for the excellent work. Could someone add support for partial results on iOS? I would like continuous speech recognition, and partial results would make that possible: after 1 second or so with no change in the partial results, which indicates a pause in speaking, I would simply send a stop listening followed immediately by a start listening.

Let me know. Thanks!
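A minimal sketch of the pause detection described above, written as plain C++ so that it could sit next to the Objective-C++ in a .mm file. `PauseDetector`, `onPartialResult`, `pauseDetected`, and `reset` are hypothetical names, not part of the plugin. The sketch assumes the caller feeds it every partial transcript and polls it on a timer, passing in the current time.

```cpp
#include <cassert>
#include <chrono>
#include <string>

// Hypothetical helper (not part of the plugin): remembers when the partial
// transcript last changed and reports a pause once it has stayed the same
// for at least `quietPeriod`.
class PauseDetector {
public:
    using Clock = std::chrono::steady_clock;

    explicit PauseDetector(std::chrono::milliseconds quietPeriod)
        : quietPeriod_(quietPeriod) {
        assert(quietPeriod.count() >= 0);
    }

    // Call with each partial transcript as it arrives.
    void onPartialResult(const std::string& text, Clock::time_point now) {
        if (text != lastText_) {
            lastText_ = text;
            lastChange_ = now;
            hasText_ = true;
        }
    }

    // Poll periodically. True only if some text has been heard and it has
    // not changed for at least the quiet period.
    bool pauseDetected(Clock::time_point now) const {
        return hasText_ && (now - lastChange_) >= quietPeriod_;
    }

    // Call after stopping and restarting recognition.
    void reset() {
        lastText_.clear();
        hasText_ = false;
    }

private:
    std::chrono::milliseconds quietPeriod_;
    std::string lastText_;
    Clock::time_point lastChange_{};
    bool hasText_ = false;
};
```

When `pauseDetected` returns true, the caller would stop listening, call `reset`, and start listening again. Taking the time as a parameter, rather than reading the clock inside the class, keeps the logic easy to check without waiting in real time.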

calvma commented 3 years ago

Enabling partial results in Swift takes just one more line. Is it possible to make the onPartialResultsCallback method available for iOS as well as Android?

calvma commented 3 years ago

@Korbain I figured out where to set the flag in Xcode to enable partial results! Just change the value on line 90 from NO to YES.

calvma commented 3 years ago

I'm not sure how to set this flag in C#/Unity though :(

Korbain commented 3 years ago

Thanks! I’ll try it in Xcode and keep you posted.

Korbain commented 3 years ago

OK, I set this flag in Xcode, but it is not sufficient. I believe onResults is only called when stopRecording is called, i.e. the onPartialResults callback is not implemented. Unfortunately, this is beyond my pay grade. If anyone knows how to do this end to end for Unity, I would appreciate it.

calvma commented 3 years ago

Are you sure? If you look at the console in Xcode, you can clearly see the partial text.

calvma commented 3 years ago

Also, the SpeechRecorderViewController.mm file lives in the Unity repo as well, in case you'd rather make changes to the file in this repo than in the Xcode build!

calvma commented 3 years ago

@Korbain I see what you're saying now!

So I don't think onResults() is the issue. The issue is that speech recognition is called in stopRecording(). If I'm able to figure out how to reorient the obj-c code to run speech recognition in startRecording() and check for result.isFinal then I'll do a pull request!

I'm not sure what your pay grade is, but see this GitHub sample project for an example.

calvma commented 3 years ago

DONE https://github.com/PingAK9/Speech-And-Text-Unity-iOS-Android/pull/32

Korbain commented 3 years ago

Thanks a lot! I will try it this weekend.

Korbain commented 3 years ago

Hi, a few people have asked for my version, so here it is:

//
//  SpeechRecorderViewController.m
//  SpeechToText
//

#import "SpeechRecorderViewController.h"
#import <Speech/Speech.h>

@interface SpeechRecorderViewController ()
{
    // Speech recognize
    SFSpeechRecognizer *speechRecognizer;
    SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
    SFSpeechRecognitionTask *recognitionTask;
    // Record speech using audio Engine
    AVAudioInputNode *inputNode;
    AVAudioEngine *audioEngine;
    NSString *LanguageCode;
}
@end

@implementation SpeechRecorderViewController

- (id)init {
    audioEngine = [[AVAudioEngine alloc] init];
    LanguageCode = @"ko-KR";
    NSLocale *local = [[NSLocale alloc] initWithLocaleIdentifier:LanguageCode];
    speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:local];

    //for (NSLocale *locate in [SFSpeechRecognizer supportedLocales]) {
    //    NSLog(@"%@", [locate localizedStringForCountryCode:locate.countryCode]);
    //}

    // Check Authorization Status
    // Make sure you add "Privacy - Microphone Usage Description" key and reason in Info.plist to request microphone permission
    // And "NSSpeechRecognitionUsageDescription" key for requesting speech recognition permission
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        // The callback may not be called on the main thread. Add an operation to the main queue to update the record button's state.
        dispatch_async(dispatch_get_main_queue(), ^{
            switch (status) {
                case SFSpeechRecognizerAuthorizationStatusAuthorized: {
                    NSLog(@"SUCCESS");
                    break;
                }
                case SFSpeechRecognizerAuthorizationStatusDenied: {
                    NSLog(@"User denied access to speech recognition");
                    break;
                }
                case SFSpeechRecognizerAuthorizationStatusRestricted: {
                    NSLog(@"Speech recognition is restricted on this device");
                    break;
                }
                case SFSpeechRecognizerAuthorizationStatusNotDetermined: {
                    NSLog(@"Speech recognition has not been authorized yet");
                    break;
                }
            }
        });
    }];

    return self;
}

- (void)SettingSpeech:(const char *)_language {
    LanguageCode = [NSString stringWithUTF8String:_language];
    NSLocale *local = [[NSLocale alloc] initWithLocaleIdentifier:LanguageCode];
    speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:local];
    UnitySendMessage("SpeechToText", "onMessage", "Setting Success");
}

// recording
- (void)startRecording {
    if (!audioEngine.isRunning) {
        [inputNode removeTapOnBus:0];
        if (recognitionTask) {
            [recognitionTask cancel];
            recognitionTask = nil;
        }

AVAudioSession *session = [AVAudioSession sharedInstance];
[session setCategory:AVAudioSessionCategoryPlayAndRecord mode:AVAudioSessionModeMeasurement options:AVAudioSessionCategoryOptionDefaultToSpeaker error:nil];
[session setActive:TRUE withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:nil];
[session setActive:TRUE withOptions:AVAudioSessionPortOverrideSpeaker error:nil];

inputNode = audioEngine.inputNode;

recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
recognitionRequest.shouldReportPartialResults = YES;
recognitionTask =[speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error)
{
    if (result) {
        NSString *transcriptText = result.bestTranscription.formattedString;
        //NSLog(@"STARTRECORDING RESULT: %@", transcriptText);
        if (result.isFinal) {
            UnitySendMessage("SpeechToText", "onResults", [transcriptText UTF8String]);
        }
        else
        {
            UnitySendMessage("SpeechToText", "onPartialResults", [transcriptText UTF8String]);
        }
    }
    else {
        [audioEngine stop];
        recognitionTask = nil;
        recognitionRequest = nil;
        UnitySendMessage("SpeechToText", "onResults", "nil");
        //NSLog(@"STARTRECORDING RESULT NULL");
    }
}];

AVAudioFormat *format = [inputNode outputFormatForBus:0];

[inputNode installTapOnBus:0 bufferSize:1024 format:format block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
    [recognitionRequest appendAudioPCMBuffer:buffer];
}];
[audioEngine prepare];
NSError *error1;
[audioEngine startAndReturnError:&error1];

if (error1.description) {
    NSLog(@"errorAudioEngine.description: %@", error1.description);
}

    }
}

- (void)stopRecording {
    if (audioEngine.isRunning) {
        [inputNode removeTapOnBus:0];
        [audioEngine stop];
        [recognitionRequest endAudio];
        if (recognitionTask) {
            [recognitionTask cancel];
        }
        //NSLog(@"STOPRECORDING");
    }
}

@end

extern "C" {
    SpeechRecorderViewController *vc = [[SpeechRecorderViewController alloc] init];
    void _TAG_startRecording() {
        [vc startRecording];
    }
    void _TAG_stopRecording() {
        [vc stopRecording];
    }
    void _TAG_SettingSpeech(const char *_language) {
        [vc SettingSpeech:_language];
    }
}

Znoleg commented 3 years ago

@Korbain Thanks a lot!