janowsiany / cordova-plugin-ms-speech-service

Use the Microsoft Cognitive Services Speech SDK for iOS with Cordova/PhoneGap.
MIT License

Dynamically changing source and target language for translation #1

Open Marcophono2 opened 5 years ago

Marcophono2 commented 5 years ago

First of all: Great work! It's working fine, but for my purpose I need to be able to set the source language and the target language from JavaScript for

speech in source language -> translation -> text in destination language

A nice add-on would be the ability to also set the region (like "westus" or "westeurope").

Best regards Marc

ivandroid commented 5 years ago

Same here. Nevertheless awesome work! How can one configure the language and region dynamically? Please help.

janowsiany commented 5 years ago

Hello @ivandroid and @Marcophono2, this would be a great add-on. I will take care of that soon.

ivandroid commented 5 years ago

According to the API, there is a property speechRecognitionLanguage in the SPXSpeechConfiguration class. Since I'm not familiar with Objective-C, I couldn't figure out how to assign the device's default language to this property in the MicrosoftSpeechService.m file.

https://docs.microsoft.com/en-us/objectivec/cognitive-services/speech/spxspeechconfiguration

ivandroid commented 5 years ago

If, like me, you can't wait for Jan's changes, you can change the API as follows:

MicrosoftSpeechService.m

- (void)recognizeOnce:(CDVInvokedUrlCommand*)command {
    // Guard: a recognizer from an earlier call is still set.
    // Note that this path returns without invoking the Cordova callback.
    if (speechRecognizer) {
        NSLog(@"Speech recognizer already set up");
        return;
    }
    // Arguments passed from JavaScript: [key, region, language].
    NSString* key = [command argumentAtIndex:0];
    NSString* region = [command argumentAtIndex:1];
    NSString* language = [command argumentAtIndex:2];

    SPXSpeechConfiguration *speechConfig = [[SPXSpeechConfiguration alloc] initWithSubscription:key region:region];
    if (!speechConfig) {
        NSAssert(false, @"Could not load speech config");
        return;
    }

    // Set the recognition language from the third JavaScript argument.
    speechConfig.speechRecognitionLanguage = language;

    speechRecognizer = [[SPXSpeechRecognizer alloc] init:speechConfig];
    if (!speechRecognizer) {
        NSAssert(false, @"Could not create speech recognizer");
        return;
    }

    NSLog(@"Speech recognizer set up");

    // Reject overlapping requests and report the error to JavaScript.
    if (isRecognitionInProgress) {
        NSLog(@"Recognition already in progress");
        [self sendRecognitionError:(@"Recognition already in progress")];
        return;
    }

    isRecognitionInProgress = true;
    callback = command.callbackId;

    // Run the recognition itself on a background queue.
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^{
        [self recognizeOnce];
    });
}

msSpeechService.js

var exec = require('cordova/exec');

exports.recognizeOnce = function recognizeOnce(key, region, language, onResponse, onError) {
  exec(onResponse, onError, 'MicrosoftSpeechService', 'recognizeOnce', [key, region, language]);
};

Then you can call: window.cordova.plugins.msSpeechService.recognizeOnce(key, region, language, onResponse, onError)
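
For illustration, here is a minimal sketch of how the modified call could be wrapped so that the language falls back to the device locale when none is chosen. The helper name recognizeInLanguage, the options object, and the 'en-US' fallback are assumptions made for this example and are not part of the plugin; whether the Speech service accepts a given device locale is also not guaranteed.

```javascript
// Hypothetical helper (not part of the plugin): picks a recognition language
// and forwards everything to the modified recognizeOnce shown above.
// `plugin` is expected to be window.cordova.plugins.msSpeechService.
function recognizeInLanguage(plugin, options, onResponse, onError) {
  // Prefer an explicit choice, then the device locale, then an assumed default.
  var language = options.language || options.deviceLanguage || 'en-US';
  plugin.recognizeOnce(options.key, options.region, language, onResponse, onError);
  return language;
}

// Possible use inside the app, after the 'deviceready' event:
// recognizeInLanguage(
//   window.cordova.plugins.msSpeechService,
//   { key: 'YOUR_KEY', region: 'westeurope', deviceLanguage: navigator.language },
//   function (result) { console.log(result); },
//   function (error) { console.error(error); }
// );
```

Passing the plugin object in as a parameter keeps the helper independent of the global window object, so it can be tried out with a stub outside the app.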

Marcophono2 commented 5 years ago

Thanks, Ivan, great work! But receiving text translated into a dynamically chosen language is not possible this way, is it?

Best regards Marc

ivandroid commented 5 years ago

Hello Marc. What do you mean? You can set the language dynamically this way. You would then receive the analysed speech result as an object in the onResponse callback.

Marcophono2 commented 5 years ago

Yes, that is correct. As long as you do not need a text translation, that will work. Otherwise there are two languages:

  1. The language for recognition
  2. The language into which the recognized text shall be translated

For example, a user in India probably speaks Hindi, but he should be able to choose, say, "French" as the language he is speaking and "German" as the language for the translated text. As far as I understand your code, you can now set "French" dynamically, to stay with my example, but not "German" as the translation target language. Or am I wrong? (It has happened once or twice in the past ;)
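
To make the two-language case concrete, here is a rough sketch of what the JavaScript side of such a call might look like. The action name 'translateOnce' and the factory makeTranslateOnce are hypothetical: nothing in this thread shows the plugin implementing them, and the native side would still have to be written (presumably with the SDK's translation classes rather than SPXSpeechRecognizer).

```javascript
// Hypothetical bridge function, modelled on the recognizeOnce wrapper above.
// `exec` is injected so the sketch does not depend on require('cordova/exec').
function makeTranslateOnce(exec) {
  return function translateOnce(key, region, sourceLanguage, targetLanguage, onResponse, onError) {
    // 'translateOnce' is an assumed native action; the plugin would need a
    // matching Objective-C method that reads these four arguments.
    exec(onResponse, onError, 'MicrosoftSpeechService', 'translateOnce',
      [key, region, sourceLanguage, targetLanguage]);
  };
}

// In the plugin's JavaScript file this could be exported as:
// exports.translateOnce = makeTranslateOnce(require('cordova/exec'));
// and called with, for example, sourceLanguage 'fr-FR' and targetLanguage 'de'.
```

The source language plays the same role as the language argument in recognizeOnce, while the target language is the additional value the native code would need for the translation step.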

Marcophono2 commented 5 years ago

@Ivan: Hey, are you from Bochum? I am next door. I am in Wetter an der Ruhr! :)

ivandroid commented 5 years ago

Yes, Marc, I am from Bochum. It's a small world. :) I thought that Microsoft's SpeechSDK framework could only recognise spoken language and not translate it. By the way, there are a lot of other frameworks which can translate text.

Marcophono2 commented 5 years ago

I tested a lot of frameworks, but the MS one is currently really the best, so I want to continue with that one. But yes, indeed, it can also translate. :)

Marcophono2 commented 5 years ago

@Ivan: I googled your name because I was curious which time zone you are living in. Your name sounds Russian, but even in Moscow it's about 5am now. Okay, 2am, like now in Germany, is also not everyone's preferred working time... ;)