Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Retrieve VoiceProfile from Server for later identification/verification #808

Closed: roninstar closed this issue 3 years ago

roninstar commented 3 years ago

I have tried everything I could think of to perform identification or verification on our saved users after their voices have been trained, without retraining every time the app starts. Even if I save the GUIDs of the voice profiles once they are created, I cannot retrieve the voice profiles from the Azure service in order to verify a user's identity at a later date. The examples provided require a user to do voice training every single time they start the application. How can I verify an enrolled user after voice training without having to create a brand new profile every time? https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-speaker-recognition

public async Task VerificationEnroll(SpeechConfig config, Dictionary<string, string> profileMapping)
{
    var client = new VoiceProfileClient(config);
    _voiceProfile = await client.CreateProfileAsync(VoiceProfileType.TextIndependentVerification, "en-us");
    var audioInput = AudioConfig.FromDefaultMicrophoneInput();

    CurrentInstructions.Text = $"Enrolling profile id {_voiceProfile.Id}.";

    // Give the profile a human-readable display name.
    _profileMapping.Add(_voiceProfile.Id, "Person's Name");

    VoiceProfileEnrollmentResult result = null;
    while (result is null || result.RemainingEnrollmentsSpeechLength > TimeSpan.Zero)
    {
        CurrentInstructions.Text = "Continue speaking to add to the profile enrollment sample.";
        result = await client.EnrollProfileAsync(_voiceProfile, audioInput);
        CurrentInstructions.Text = $"Remaining enrollment audio time needed: {result.RemainingEnrollmentsSpeechLength}";
    }

    if (result.Reason == ResultReason.EnrolledVoiceProfile)
    {
        await SpeakerVerify(config, _voiceProfile, _profileMapping);
    }
    else if (result.Reason == ResultReason.Canceled)
    {
        var cancellation = VoiceProfileEnrollmentCancellationDetails.FromResult(result);
        CurrentInstructions.Text = $"CANCELED {_voiceProfile.Id}: ErrorCode={cancellation.ErrorCode} ErrorDetails={cancellation.ErrorDetails}";
    }
}

public async Task SpeakerVerify(SpeechConfig config, VoiceProfile profile, Dictionary<string, string> profileMapping)
{
    try
    {
        var speakerRecognizer = new SpeakerRecognizer(config, AudioConfig.FromDefaultMicrophoneInput());
        var users = speakerRecognizer.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult);
        var model = SpeakerVerificationModel.FromProfile(profile);

        CurrentInstructions.Text = "Speak the passphrase to verify: \"My voice is my passport, please verify me.\"";
        var result = await speakerRecognizer.RecognizeOnceAsync(model);
        CurrentInstructions.Text = $"Verified voice profile for speaker {profileMapping[result.ProfileId]}, score is {result.Score}";
    }
    catch (Exception ex)
    {
        // Show the failure instead of silently swallowing it.
        CurrentInstructions.Text = $"Verification failed: {ex.Message}";
    }
}

brandom-msft commented 3 years ago

Hi @roninstar, thanks for the question - I've reached out to our team expert about this and will reply back with my findings.

lisaweixu commented 3 years ago

There's a known bug in the Speech SDK that requires the profile type to be specified when constructing a profile from an ID. This will be addressed in an upcoming release of the Speech SDK. The workaround is to use the RESTful API instead.

roninstar commented 3 years ago

Hi Lisa, the issue we are having is that there is no way to get the user profile. There isn't any method to get the voice profile from the server using the SDK. Is this going to be addressed in the future?

roninstar commented 3 years ago

So basically, if you do voice training with Speaker Recognition, create the profile, and set the type as text-independent, you still have no way to perform speaker identification later on. Even if you save the GUID, there is no way to retrieve and use the profile without having to do voice training every time the app starts.

oscholz commented 3 years ago

@roninstar, we addressed this in the 1.16 release of the Speech SDK. Please give it a try. See our release notes for more info.
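
For readers landing on this thread, a minimal sketch of verifying a speaker against a previously saved profile ID, without re-enrolling, might look like the following. It assumes Speech SDK 1.16 or later and that the fix is exposed as a `VoiceProfile(string, VoiceProfileType)` constructor; the thread itself does not spell out the API, so check the release notes for your SDK version. `StoredProfileVerifier`, `VerifyStoredProfileAsync`, and `savedProfileId` are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Speaker;

public static class StoredProfileVerifier
{
    // savedProfileId: the GUID string the app persisted after enrollment
    // (hypothetical parameter; how it is stored is up to the application).
    public static async Task<bool> VerifyStoredProfileAsync(SpeechConfig config, string savedProfileId)
    {
        // Assumption: SDK 1.16+ lets a profile be rebuilt from its ID and type.
        // The type must match the one used when the profile was enrolled.
        using (var profile = new VoiceProfile(savedProfileId, VoiceProfileType.TextIndependentVerification))
        using (var audioInput = AudioConfig.FromDefaultMicrophoneInput())
        using (var recognizer = new SpeakerRecognizer(config, audioInput))
        {
            var model = SpeakerVerificationModel.FromProfile(profile);

            // Listens once on the default microphone and scores the speaker
            // against the stored profile.
            var result = await recognizer.RecognizeOnceAsync(model);

            if (result.Reason == ResultReason.RecognizedSpeaker)
            {
                Console.WriteLine($"Verified profile {result.ProfileId}, score {result.Score}");
                return true;
            }

            if (result.Reason == ResultReason.Canceled)
            {
                var details = SpeakerRecognitionCancellationDetails.FromResult(result);
                Console.WriteLine($"CANCELED: ErrorCode={details.ErrorCode} ErrorDetails={details.ErrorDetails}");
            }

            return false;
        }
    }
}
```

With this shape, the app would run enrollment once, store `_voiceProfile.Id` (for example alongside the display name in `_profileMapping`), and on later launches call `VerifyStoredProfileAsync(config, storedId)` instead of `CreateProfileAsync`.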