Closed Korbain closed 2 years ago
Enabling partial results takes only one more line in Swift. Is it possible to make the onPartialResultsCallback
method available for iOS as well as Android?
@Korbain I figured out where to set the flag in Xcode to get partial results! Just change the value on line 90 from NO
to YES.
I'm not sure how to set this flag in C#/Unity though :(
Thanks! I'll try it in Xcode and keep you posted.
OK, I set this flag in Xcode, but it is not sufficient. I believe onResults is only called when stopRecording is called, i.e. the onPartialResults callback is not implemented. Unfortunately, this is beyond my pay grade. If anyone knows how to do this end to end for Unity, I would appreciate it.
Are you sure? If you look at the console in Xcode, you can clearly see the partial text.
Also, the SpeechRecorderViewController.mm file lives in the Unity repo as well, in case you'd rather make changes to the file in this repo than in the Xcode build!
@Korbain I see what you're saying now!
So I don't think onResults() is the issue. The issue is that speech recognition is run in stopRecording(). If I can figure out how to restructure the Objective-C code to run speech recognition in startRecording() and check result.isFinal, I'll open a pull request!
I'm not sure what your pay grade is, but see this GitHub sample project with an example.
Thanks a lot! I will try it this weekend.
Hi, a few people have asked for my version, so here it is:
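//
//  SpeechRecorderViewController.m
//  SpeechToText
//

// Imports were not part of the posted snippet; the header name below is assumed from the class name.
#import "SpeechRecorderViewController.h"
#import <Speech/Speech.h>
#import <AVFoundation/AVFoundation.h>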
@interface SpeechRecorderViewController ()
{
    // Speech recognition
    SFSpeechRecognizer *speechRecognizer;
    SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
    SFSpeechRecognitionTask *recognitionTask;
    // Record speech using the audio engine
    AVAudioInputNode *inputNode;
    AVAudioEngine *audioEngine;
    NSString *LanguageCode;
}
@end
@implementation SpeechRecorderViewController

- (id)init {
    self = [super init];
    if (self) {
        audioEngine = [[AVAudioEngine alloc] init];
        LanguageCode = @"ko-KR";
        NSLocale *local = [[NSLocale alloc] initWithLocaleIdentifier:LanguageCode];
        speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:local];

        //for (NSLocale *locate in [SFSpeechRecognizer supportedLocales]) {
        //    NSLog(@"%@", [locate localizedStringForCountryCode:locate.countryCode]);
        //}

        // Check authorization status.
        // Make sure you add the "Privacy - Microphone Usage Description" key and a reason to Info.plist to request microphone permission,
        // and the "NSSpeechRecognitionUsageDescription" key to request speech recognition permission.
        [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
            // The callback may not be called on the main thread, so hop to the main queue
            // before updating any UI state (for example a record button).
            dispatch_async(dispatch_get_main_queue(), ^{
                switch (status) {
                    case SFSpeechRecognizerAuthorizationStatusAuthorized: {
                        NSLog(@"SUCCESS");
                        break;
                    }
                    case SFSpeechRecognizerAuthorizationStatusDenied: {
                        NSLog(@"User denied access to speech recognition");
                        break;
                    }
                    case SFSpeechRecognizerAuthorizationStatusRestricted: {
                        NSLog(@"Speech recognition is restricted on this device");
                        break;
                    }
                    case SFSpeechRecognizerAuthorizationStatusNotDetermined: {
                        NSLog(@"Speech recognition authorization not yet determined");
                        break;
                    }
                }
            });
        }];
    }
    return self;
}
- (void)SettingSpeech:(const char *)_language
{
    LanguageCode = [NSString stringWithUTF8String:_language];
    NSLocale *local = [[NSLocale alloc] initWithLocaleIdentifier:LanguageCode];
    speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:local];
    UnitySendMessage("SpeechToText", "onMessage", "Setting Success");
}
// recording
- (void)startRecording {
    if (!audioEngine.isRunning) {
        // Clear any tap or task left over from a previous session.
        [inputNode removeTapOnBus:0];
        if (recognitionTask) {
            [recognitionTask cancel];
            recognitionTask = nil;
        }

        AVAudioSession *session = [AVAudioSession sharedInstance];
        [session setCategory:AVAudioSessionCategoryPlayAndRecord mode:AVAudioSessionModeMeasurement options:AVAudioSessionCategoryOptionDefaultToSpeaker error:nil];
        [session setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:nil];
        // The port override belongs to overrideOutputAudioPort:, not to setActive:withOptions:.
        [session overrideOutputAudioPort:AVAudioSessionPortOverrideSpeaker error:nil];

        inputNode = audioEngine.inputNode;
        recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
        // Ask the recognizer to report intermediate transcripts while the user is still speaking.
        recognitionRequest.shouldReportPartialResults = YES;
        recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error)
        {
            if (result) {
                NSString *transcriptText = result.bestTranscription.formattedString;
                //NSLog(@"STARTRECORDING RESULT: %@", transcriptText);
                if (result.isFinal) {
                    // Final transcript for this request.
                    UnitySendMessage("SpeechToText", "onResults", [transcriptText UTF8String]);
                }
                else
                {
                    // Intermediate transcript; it may still change.
                    UnitySendMessage("SpeechToText", "onPartialResults", [transcriptText UTF8String]);
                }
            }
            else {
                // No result (error or cancellation): tear down and report "nil" to Unity.
                [audioEngine stop];
                recognitionTask = nil;
                recognitionRequest = nil;
                UnitySendMessage("SpeechToText", "onResults", "nil");
                //NSLog(@"STARTRECORDING RESULT NULL");
            }
        }];

        // Feed microphone buffers into the recognition request.
        AVAudioFormat *format = [inputNode outputFormatForBus:0];
        [inputNode installTapOnBus:0 bufferSize:1024 format:format block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
            [recognitionRequest appendAudioPCMBuffer:buffer];
        }];

        [audioEngine prepare];
        NSError *error1 = nil;
        // Check the return value rather than the error object to detect failure.
        if (![audioEngine startAndReturnError:&error1]) {
            NSLog(@"errorAudioEngine.description: %@", error1.description);
        }
    }
}
- (void)stopRecording {
    if (audioEngine.isRunning) {
        [inputNode removeTapOnBus:0];
        [audioEngine stop];
        [recognitionRequest endAudio];
        if (recognitionTask) {
            [recognitionTask cancel];
        }
        //NSLog(@"STOPRECORDING");
    }
}
@end

extern "C" {
    SpeechRecorderViewController *vc = [[SpeechRecorderViewController alloc] init];

    void _TAG_startRecording() {
        [vc startRecording];
    }

    void _TAG_stopRecording() {
        [vc stopRecording];
    }

    void _TAG_SettingSpeech(const char *_language) {
        [vc SettingSpeech:_language];
    }
}
@Korbain Thanks a lot!
Hi,
Thanks for the excellent work. Could someone add support for partial results on iOS? I would like to have continuous speech recognition, and this would make it possible: I would simply send a stop-listening call followed immediately by a start-listening call once the partial results have not changed for about 1 second, which indicates a pause in speaking.
Let me know. Thanks!
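
The pause detection described above can be sketched independently of the plugin. What follows is a minimal sketch in plain C++ (which also compiles inside an Objective-C++ .mm file), not part of the plugin itself: `PauseDetector`, `onPartialResult`, `shouldRestart`, and `reset` are hypothetical names introduced here for illustration. It assumes the caller forwards every partial transcript together with a timestamp and polls periodically to decide when to issue the stop/start pair.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper (not part of the plugin): remembers the latest partial
// transcript and reports when it has stayed unchanged for a given quiet period.
class PauseDetector {
public:
    using Clock = std::chrono::steady_clock;
    using TimePoint = Clock::time_point;

    explicit PauseDetector(std::chrono::milliseconds quietPeriod)
        : quietPeriod_(quietPeriod) {}

    // Call with each partial transcript. The timer restarts only when the
    // text actually differs from the previous partial result.
    void onPartialResult(const std::string& text, TimePoint now) {
        if (!hasText_ || text != lastText_) {
            lastText_ = text;
            lastChange_ = now;
            hasText_ = true;
        }
    }

    // Poll periodically. Returns true once at least one partial result has
    // arrived and it has been unchanged for the whole quiet period.
    bool shouldRestart(TimePoint now) const {
        return hasText_ && (now - lastChange_) >= quietPeriod_;
    }

    // Call after the stop/start pair so the next utterance is tracked afresh.
    void reset() {
        hasText_ = false;
        lastText_.clear();
    }

    // The last transcript seen, i.e. the text to keep when a pause is detected.
    const std::string& lastText() const { return lastText_; }

private:
    std::chrono::milliseconds quietPeriod_;
    std::string lastText_;
    TimePoint lastChange_{};
    bool hasText_ = false;
};
```

With a quiet period of `std::chrono::milliseconds(1000)`, the caller would pass each onPartialResults string to `onPartialResult` with `PauseDetector::Clock::now()`, and when `shouldRestart` returns true, keep `lastText()`, call stop and then start, and finally `reset()`. Passing the timestamp in explicitly, rather than reading the clock inside the class, keeps the logic easy to exercise without waiting in real time.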