microsoftgraph / microsoft-graph-comms-samples

Microsoft Graph Communications Samples
MIT License

[PolicyRecordingBot] How does the dominant speaker concept work in the Graph Communications media bot SDK? #329

Open Manojb86 opened 4 years ago

Manojb86 commented 4 years ago

Hi, how is the dominant speaker concept implemented in the Skype media bot SDK? How does it work, and how can it be used in an application to successfully record audio and video from the media stream?

Who is the first dominant speaker? Is it the person who initiates the call? In a video or audio call, does the dominant speaker change every time a different person speaks, and does the SDK then send that speaker's media streams? The quality of those media streams can then differ from speaker to speaker depending on camera, microphone, and bandwidth. How should that be managed?

nikskhubani commented 2 years ago

Hello,

Based on almost a year of experience with calling bots, here is my answer:

  1. The audio socket raises a dominant speaker event that you can subscribe to in your code, with a handler something like this:

     ```
     private void OnDominantSpeakerChanged(object sender, DominantSpeakerChangedEventArgs e)
     {
         this.GraphLogger.Info($"[{this.Call.Id}:OnDominantSpeakerChanged(DominantSpeaker={e.CurrentDominantSpeaker})]");

         if (e.CurrentDominantSpeaker != DominantSpeakerNone)
         {
             // Resolve the participant from the media source id (MSI) of the new dominant speaker.
             IParticipant participant = this.GetParticipantFromMSI(e.CurrentDominantSpeaker);
             var participantDetails = participant?.Resource?.Info?.Identity?.User;
             if (participantDetails != null)
             {
                 // We want to force the video subscription on dominant speaker events.
                 // The video call is commented out here:
                 // this.SubscribeToParticipantVideo(participant, forceSubscribe: true);
             }
         }
     }
     ```
  2. You can change the above code to suit your business needs.
  3. After the bot joins the call, if, say, no one talks for 5 seconds, this event is NOT raised during that time.
  4. If you then speak for one second, the event is raised with your MSI, from which your participant details can be looked up.
  5. You then run your business logic for that participant.
  6. Once you stop talking, `OnDominantSpeakerChanged` is called again to report that there is now NO dominant speaker, which means `e.CurrentDominantSpeaker` will be `DominantSpeakerNone`.
  7. As for the media streams, you keep receiving the streams of ALL subscribed media, whether or not the participant is the dominant speaker. If you don't want another participant's streams, you have to write the logic to ignore them. You can also subscribe ONLY to the dominant speaker and not to anyone else, something like below:

     ```
     if (participantDetails != null && (participantDetails.Id != this.CurrentUserId || this.CurrentUserId == null))
     {
         this.CurrentUserId = participantDetails.Id;

         // We want to force the video subscription on dominant speaker events.
         this.SubscribeToParticipantVideo(participant, forceSubscribe: true);

         // oldParticipant and oldParticipantMsi are not defined in this fragment;
         // they are presumably the previously subscribed speaker, captured earlier.
         if (oldParticipant != default && oldParticipantMsi != e.CurrentDominantSpeaker)
         {
             this.UnsubscribeFromParticipantVideo(oldParticipant);
         }
     }
     ```
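The sequence described in the list above (no event during silence, an event carrying an MSI when someone speaks, then an event carrying the "none" value) can be sketched as a small state tracker. This is only an illustration under assumptions, not SDK code: `IVideoSubscriber`, `DominantSpeakerTracker`, and `RecordingSubscriber` are hypothetical names, the "none" value is passed in rather than taken from the SDK, and keeping the last speaker's video during silence is one possible design choice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the bot's own subscribe/unsubscribe helpers
// (SubscribeToParticipantVideo / UnsubscribeFromParticipantVideo in the reply above).
public interface IVideoSubscriber
{
    void Subscribe(uint msi);
    void Unsubscribe(uint msi);
}

// Tracks which speaker's video is currently subscribed and switches on change.
public sealed class DominantSpeakerTracker
{
    private readonly IVideoSubscriber subscriber;
    private readonly uint noneMsi;
    private uint? subscribedMsi;

    // noneMsi: whatever value your code uses for DominantSpeakerNone (an assumption here).
    public DominantSpeakerTracker(IVideoSubscriber subscriber, uint noneMsi)
    {
        this.subscriber = subscriber ?? throw new ArgumentNullException(nameof(subscriber));
        this.noneMsi = noneMsi;
    }

    public uint? SubscribedMsi => this.subscribedMsi;

    // Intended to be called from the event handler with e.CurrentDominantSpeaker.
    public void OnDominantSpeakerChanged(uint currentMsi)
    {
        // Design choice: on silence, or when the same speaker resumes,
        // keep the existing subscription instead of dropping video.
        if (currentMsi == this.noneMsi || currentMsi == this.subscribedMsi)
        {
            return;
        }

        uint? previous = this.subscribedMsi;

        // Subscribe to the new speaker first, then release the old one.
        this.subscriber.Subscribe(currentMsi);
        this.subscribedMsi = currentMsi;

        if (previous.HasValue)
        {
            this.subscriber.Unsubscribe(previous.Value);
        }
    }
}

// In-memory subscriber that only records calls, useful for trying the tracker out.
public sealed class RecordingSubscriber : IVideoSubscriber
{
    public List<string> Calls { get; } = new List<string>();

    public void Subscribe(uint msi) => this.Calls.Add($"sub:{msi}");

    public void Unsubscribe(uint msi) => this.Calls.Add($"unsub:{msi}");
}
```

In a real bot, the `Subscribe` and `Unsubscribe` implementations would forward to the existing video subscription helpers, and the handler would simply pass `e.CurrentDominantSpeaker` to the tracker.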

> Then those media streams quality can be different speaker to speaker based on camera, mic and bandwidth quality. how to manage that?

Yes. We tell the call which resolution we want with `this.BotMediaStream.Subscribe(MediaType.Video, msi, VideoResolution.HD720p, socketId);`. However, that resolution is NOT guaranteed, because it also depends on the participant. For example, if the participant has low bandwidth, the received frames will be smaller. You will have to write a bridge that converts the data you receive into the form you want.
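One possible shape for such a bridge is sketched below, purely as an illustration: it detects when a participant's incoming frame size changes and computes how to fit a frame into a fixed output size while preserving aspect ratio. `ResolutionBridge` and `ResolutionMonitor` are hypothetical names, the width and height are assumed to come from the format metadata of each received video buffer, and the actual pixel scaling is left to whatever imaging or encoding library you use.

```csharp
using System;
using System.Collections.Generic;

public static class ResolutionBridge
{
    // Computes the scaled size and the centering offsets needed to place a
    // source frame inside a fixed-size output frame without distorting it.
    public static (int Width, int Height, int OffsetX, int OffsetY) FitInto(
        int srcWidth, int srcHeight, int dstWidth, int dstHeight)
    {
        if (srcWidth <= 0 || srcHeight <= 0 || dstWidth <= 0 || dstHeight <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(srcWidth), "All dimensions must be positive.");
        }

        // Use the smaller ratio so the scaled frame fits on both axes.
        double scale = Math.Min((double)dstWidth / srcWidth, (double)dstHeight / srcHeight);
        int width = Math.Max(1, (int)Math.Round(srcWidth * scale));
        int height = Math.Max(1, (int)Math.Round(srcHeight * scale));

        // Center the scaled frame; the remainder becomes letterbox or pillarbox bars.
        return (width, height, (dstWidth - width) / 2, (dstHeight - height) / 2);
    }
}

// Remembers the last frame size seen per media source id (MSI).
public sealed class ResolutionMonitor
{
    private readonly Dictionary<uint, (int Width, int Height)> lastSize =
        new Dictionary<uint, (int Width, int Height)>();

    // Returns true for the first frame from an MSI and whenever its size changes,
    // which is the point at which a recorder would reconfigure its scaler.
    public bool HasChanged(uint msi, int width, int height)
    {
        bool changed = !this.lastSize.TryGetValue(msi, out var previous)
            || previous.Width != width
            || previous.Height != height;

        this.lastSize[msi] = (width, height);
        return changed;
    }
}
```

For example, a 640x480 frame fitted into a 1280x720 output would be scaled to 960x720 and offset 160 pixels from the left, while a 640x360 frame would fill the output exactly.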