MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

How to set maximum speaker for real time diarization? #121815

Closed: u7630991 closed this issue 7 months ago

u7630991 commented 7 months ago


How to set maximum speaker for real time diarization?

Document Details

Do not edit this section. It is required for learn.microsoft.com ➟ GitHub issue linking.

PesalaPavan commented 7 months ago

@u7630991 Thanks for your feedback! We will investigate and update as appropriate.

AjayBathini-MSFT commented 7 months ago

@u7630991

To set the maximum number of speakers for real-time diarization in the Microsoft Speech service, you can configure the SpeechConfig object in the Speech SDK. Specifically, set its DiarizationNumberOfSpeakers property to the maximum number of speakers that you want to support.

Here is an example of how to set the maximum number of speakers to 2 in C#:

    // Create a SpeechConfig object with the maximum number of speakers set to 2
    var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    config.SpeechRecognitionLanguage = "en-US";
    config.EnableDiarization = true;
    config.DiarizationNumberOfSpeakers = 2;

    // Create a SpeechRecognizer object with the SpeechConfig
    using (var recognizer = new SpeechRecognizer(config))
    {
        // Start recognition and wait for a result
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine(result.Text);
    }

In this example, the DiarizationNumberOfSpeakers property of the SpeechConfig object is set to 2, which means that real-time diarization will attempt to identify at most 2 speakers in the audio stream.

u7630991 commented 7 months ago

@AjayBathini-MSFT Thanks for your reply, but I still get more than 2 speakers from real-time diarization. I have set up the transcriber in Python as below:

    import os
    import azure.cognitiveservices.speech as speechsdk

    self.speech_key = os.getenv("AZURE_SPEECH_KEY")
    self.speech_region = "australiaeast"
    self.speech_config = speechsdk.SpeechConfig(subscription=self.speech_key, region=self.speech_region)
    self.speech_config.speech_recognition_language = "en-US"

    # Note: these PascalCase names do not appear to be defined attributes of the
    # Python SpeechConfig class, so the assignments succeed silently without error
    self.speech_config.EnableDiarization = True
    self.speech_config.DiarizationNumberOfSpeakers = 2

    # Set up the stream for audio input
    self.stream = speechsdk.audio.PushAudioInputStream()
    self.audio_config = speechsdk.audio.AudioConfig(stream=self.stream)

    # Create a conversation transcriber using the audio config
    self.transcriber = speechsdk.transcription.ConversationTranscriber(speech_config=self.speech_config, audio_config=self.audio_config)
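One likely reason the setting above has no effect: in Python, assigning to an attribute that a class never defined usually succeeds silently, so a line like `self.speech_config.EnableDiarization = True` can run without any error while the SDK never sees the value. A minimal sketch of the pitfall, using a plain stand-in class (`FakeSpeechConfig` is hypothetical, not the real `SpeechConfig`):

```python
class FakeSpeechConfig:
    """Stand-in for a config object that only defines known settings."""

    def __init__(self):
        self.speech_recognition_language = "en-US"

    def known_settings(self):
        # Only the attributes this class deliberately defined in __init__
        return {"speech_recognition_language"}


config = FakeSpeechConfig()

# This assignment raises no error even though the class never defined the name...
config.EnableDiarization = True

# ...so the wrong name goes unnoticed: the object simply grows a new, unused
# attribute instead of changing any real setting.
print("EnableDiarization" in config.known_settings())  # False: not a real setting
print(config.EnableDiarization)                        # True: stored, but ignored
```

If the real SDK object behaves like this stand-in, a safer pattern is to stick to documented snake_case attributes and the SpeechConfig `set_property` / `set_property_by_name` methods, and to treat any bare PascalCase attribute assignment as a red flag.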

AjayBathini-MSFT commented 7 months ago

@u7630991 I'd recommend working more closely with our support team via an [Azure support request](https://docs.microsoft.com/en-us/azure/azure-portal/supportability/how-to-create-azure-support-request). We'll follow up there.