microsoft / MixedReality-WebRTC

MixedReality-WebRTC is a collection of components to help mixed reality app developers integrate audio and video real-time communication into their application and improve their collaborative experience
https://microsoft.github.io/MixedReality-WebRTC/
MIT License

Convert to I420A - Screen capturing with GraphicsCapturePicker #721

Open Mt-Perazim opened 3 years ago

Mt-Perazim commented 3 years ago

I know that the Issues section is not meant for questions like this, but I don't know where else I could ask, and there are definitely developers here who are familiar with this topic. So please forgive me.

Setup

I'm trying to capture my screen and send the stream via MR-WebRTC. Communication between two PCs, or between a PC and a HoloLens, already works for me with webcams, so I thought the next step could be streaming my screen. I took the UWP application I already had working with my webcam and tried to adapt it.

So now I'm stuck in the following situation:

  1. I get a frame from the screen capture, but its type is Direct3D11CaptureFrame. You can see it in the code snippet below.
  2. MR-WebRTC expects a frame of type I420AVideoFrame (also shown in a code snippet below).

How can I "connect" them?

Code snippet (frame from Direct3D capture):

_framePool = Direct3D11CaptureFramePool.Create(
                _canvasDevice,                             // D3D device
                DirectXPixelFormat.B8G8R8A8UIntNormalized, // Pixel format
                3,                                         // Number of frames
                _item.Size);                               // Size of the buffers

_session = _framePool.CreateCaptureSession(_item);
_session.StartCapture();
_framePool.FrameArrived += (s, a) =>
{
    using (var frame = _framePool.TryGetNextFrame())
    {
        // Here I would take the Frame and call the MR-WebRTC method LocalI420AFrameReady  
    }
};

Code snippet (frame from WebRTC):

// This is how it works with the webcam: LocalI420AFrameReady is subscribed
// to the I420AVideoFrameReady event and receives the frames from there.
_webcamSource = await DeviceVideoTrackSource.CreateAsync();
_webcamSource.I420AVideoFrameReady += LocalI420AFrameReady;

// enqueueing the newly captured video frames into the bridge,
// which will later deliver them when the Media Foundation
// playback pipeline requests them.
private void LocalI420AFrameReady(I420AVideoFrame frame)
    {
        lock (_localVideoLock)
        {
            if (!_localVideoPlaying)
            {
                _localVideoPlaying = true;

                // Capture the resolution into local variable useable from the lambda below
                uint width = frame.width;
                uint height = frame.height;

                // Defer UI-related work to the main UI thread
                RunOnMainThread(() =>
                {
                    // Bridge the local video track with the local media player UI
                    int framerate = 30; // assumed, for lack of an actual value
                    _localVideoSource = CreateI420VideoStreamSource(
                        width, height, framerate);
                    var localVideoPlayer = new MediaPlayer();
                    localVideoPlayer.Source = MediaSource.CreateFromMediaStreamSource(
                        _localVideoSource);
                    localVideoPlayerElement.SetMediaPlayer(localVideoPlayer);
                    localVideoPlayer.Play();
                });
            }
        }
        // Enqueue the incoming frame into the video bridge; the media player will
        // later dequeue it as soon as it's ready.
        _localVideoBridge.HandleIncomingVideoFrame(frame);
    }
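
For context, CreateI420VideoStreamSource (called above but not shown) is the helper from the MR-WebRTC UWP sample app that wraps a MediaStreamSource around the video bridge. Roughly, it looks like the sketch below; this is an approximation of the sample, and the exact property names and the VideoBridge API may differ in your version.

    // Approximate sketch of the CreateI420VideoStreamSource helper used above
    // (modeled on the MR-WebRTC UWP sample app; adapt to your version).
    // Types come from Windows.Media.Core and Windows.Media.MediaProperties.
    private MediaStreamSource CreateI420VideoStreamSource(uint width, uint height, int framerate)
    {
        // IYUV and I420 share the same memory layout, so IYUV is used as the uncompressed subtype.
        var videoProperties = VideoEncodingProperties.CreateUncompressed(
            MediaEncodingSubtypes.Iyuv, width, height);
        var videoStreamDesc = new VideoStreamDescriptor(videoProperties);
        videoStreamDesc.EncodingProperties.FrameRate.Numerator = (uint)framerate;
        videoStreamDesc.EncodingProperties.FrameRate.Denominator = 1;

        var videoStreamSource = new MediaStreamSource(videoStreamDesc)
        {
            BufferTime = TimeSpan.Zero, // do not buffer; render frames as soon as they arrive
            IsLive = true,
            CanSeek = false
        };
        // Serve frames from the video bridge whenever the playback pipeline requests a sample.
        videoStreamSource.SampleRequested += (sender, args) =>
            _localVideoBridge.TryServeVideoFrame(args);
        return videoStreamSource;
    }
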
KarthikRichie commented 3 years ago

I'm not sure about converting to I420A, but you can use the conversion to Argb32VideoFrame to achieve the same result. Here is the snippet. You can directly query your Direct3D11CaptureFramePool and convert to an Argb32VideoFrame whenever you get a request for a frame to send to the remote peer. Hope this helps.

    // Setting up external video track source
    _screenshareSource = ExternalVideoTrackSource.CreateFromArgb32Callback(FrameCallback);

    struct WebRTCFrameData
    {
        public IntPtr Data;
        public uint Height;
        public uint Width;
        public int Stride;
    }

    public void FrameCallback(in FrameRequest frameRequest)
    {
        try
        {
            if (FramePool != null)
            {
                using (Direct3D11CaptureFrame _currentFrame = FramePool.TryGetNextFrame())
                {
                    if (_currentFrame != null)
                    {
                        // Note: .Result blocks the callback thread until the async copy completes.
                        WebRTCFrameData webRTCFrameData = ProcessBitmap(_currentFrame.Surface).Result;
                        frameRequest.CompleteRequest(new Argb32VideoFrame()
                        {
                            data = webRTCFrameData.Data,
                            height = webRTCFrameData.Height,
                            width = webRTCFrameData.Width,
                            stride = webRTCFrameData.Stride
                        });
                    }
                }
            }
        }
        catch (Exception ex)
        {
            // Swallow per-frame errors so a single bad frame does not stop the capture loop.
        }
    }

    private async Task<WebRTCFrameData> ProcessBitmap(IDirect3DSurface surface)
    {
        // Copy the captured Direct3D surface into a CPU-side SoftwareBitmap.
        SoftwareBitmap softwareBitmap = await SoftwareBitmap.CreateCopyFromSurfaceAsync(surface, Windows.Graphics.Imaging.BitmapAlphaMode.Straight);

        // AsBuffer() requires a reference to System.Runtime.InteropServices.WindowsRuntime.
        byte[] imageBytes = new byte[4 * softwareBitmap.PixelWidth * softwareBitmap.PixelHeight];
        softwareBitmap.CopyToBuffer(imageBytes.AsBuffer());

        WebRTCFrameData argb32VideoFrame = new WebRTCFrameData();
        argb32VideoFrame.Data = GetByteIntPtr(imageBytes);
        argb32VideoFrame.Height = (uint)softwareBitmap.PixelHeight;
        argb32VideoFrame.Width = (uint)softwareBitmap.PixelWidth;

        // Read the stride (bytes per row) from the bitmap's plane description.
        var bitmapBuffer = softwareBitmap.LockBuffer(BitmapBufferAccessMode.Read);
        int planeCount = bitmapBuffer.GetPlaneCount();
        var planeDescription = bitmapBuffer.GetPlaneDescription(planeCount - 1);
        argb32VideoFrame.Stride = planeDescription.Stride;

        return argb32VideoFrame;
    }

    private IntPtr GetByteIntPtr(byte[] byteArr)
    {
        // Returns the address of the first array element. Note that this call by itself
        // does not pin the array; see the note below.
        IntPtr intPtr2 = System.Runtime.InteropServices.Marshal.UnsafeAddrOfPinnedArrayElement(byteArr, 0);
        return intPtr2;
    }
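
One caveat about GetByteIntPtr: Marshal.UnsafeAddrOfPinnedArrayElement only returns the address of the first element, it does not actually pin the array, so in principle the GC could move the buffer while native code still holds the pointer. Below is a sketch of a variant that pins the buffer explicitly; it is a hypothetical helper, and it assumes CompleteRequest copies the frame data before returning.

    // Hypothetical variant of the frame completion that pins the managed buffer while the
    // frame is handed over, so the GC cannot move it while native code reads the pointer.
    // Assumes frameRequest.CompleteRequest() copies the frame data before returning.
    private static void CompleteWithPinnedBuffer(in FrameRequest frameRequest, byte[] imageBytes,
        uint width, uint height, int stride)
    {
        var handle = System.Runtime.InteropServices.GCHandle.Alloc(
            imageBytes, System.Runtime.InteropServices.GCHandleType.Pinned);
        try
        {
            frameRequest.CompleteRequest(new Argb32VideoFrame()
            {
                data = handle.AddrOfPinnedObject(),
                width = width,
                height = height,
                stride = stride
            });
        }
        finally
        {
            handle.Free();
        }
    }
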
Mt-Perazim commented 3 years ago

Thank you very much for the answer! I will try this today.

Mt-Perazim commented 3 years ago

Thank you very much! You have helped me a lot with this. The transfer is now working (for now the stream only works from A to B and not vice versa, but I'll figure out why).

I didn't know that SoftwareBitmap can be used for this, or rather that this approach has to be used for this purpose, and the GetByteIntPtr method is still magic to me.

janosfichter commented 3 years ago

Thank you also from my side - that helped a lot to get the capturing running.

I have one question: I registered the callback on the sender's side, which returns an Argb32VideoFrame, and the LocalVideoTrack also shows Argb32 for its FrameEncoding property. But after the track is created on the receiver's side (announced by the VideoTrackAdded event), it shows I420A for FrameEncoding. I actually converted the frames on the receiver's side to RGB and can confirm that they are indeed received with I420 encoding.

Any explanation for why this is happening? I would like to avoid any conversions to keep latency as low as possible.

That's my SDP data (sender --> receiver):

    v=0
    o=- 8514878755156339479 2 IN IP4 127.0.0.1
    s=-
    t=0 0
    a=group:BUNDLE 0
    a=msid-semantic: WMS
    m=video 9 UDP/TLS/RTP/SAVPF 98
    c=IN IP4 0.0.0.0
    a=rtcp:9 IN IP4 0.0.0.0
    a=ice-ufrag:UKzY
    a=ice-pwd:3TT3OUvNz8WU6rnMkEKWlHXq
    a=ice-options:trickle
    a=fingerprint:sha-256 CE:21:D8:1D:79:E6:4D:10:81:64:F7:50:D5:CC:E9:02:FE:3D:F7:55:A9:16:E6:D7:17:CC:60:90:B6:2E:28:33
    a=setup:actpass
    a=mid:0
    a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
    a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
    a=extmap:4 urn:3gpp:video-orientation
    a=extmap:5 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
    a=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
    a=extmap:7 http://www.webrtc.org/experiments/rtp-hdrext/video-content-type
    a=extmap:8 http://www.webrtc.org/experiments/rtp-hdrext/video-timing
    a=extmap:10 http://tools.ietf.org/html/draft-ietf-avtext-framemarking-07
    a=extmap:9 urn:ietf:params:rtp-hdrext:sdes:mid
    a=sendrecv
    a=msid:- 9c5ed19a-9b67-4d5a-ab2c-a938ff0b62e2
    a=rtcp-mux
    a=rtcp-rsize
    a=rtpmap:98 VP9/90000
    a=rtcp-fb:98 goog-remb
    a=rtcp-fb:98 transport-cc
    a=rtcp-fb:98 ccm fir
    a=rtcp-fb:98 nack
    a=rtcp-fb:98 nack pli
    a=fmtp:98 x-google-profile-id=0
    a=ssrc-group:FID 4054647876 3353276698
    a=ssrc:4054647876 cname:cEEodkPJync6OXIF
    a=ssrc:4054647876 msid: 9c5ed19a-9b67-4d5a-ab2c-a938ff0b62e2
    a=ssrc:4054647876 mslabel:
    a=ssrc:4054647876 label:9c5ed19a-9b67-4d5a-ab2c-a938ff0b62e2
    a=ssrc:3353276698 cname:cEEodkPJync6OXIF
    a=ssrc:3353276698 msid: 9c5ed19a-9b67-4d5a-ab2c-a938ff0b62e2
    a=ssrc:3353276698 mslabel:
    a=ssrc:3353276698 label:9c5ed19a-9b67-4d5a-ab2c-a938ff0b62e2

KarthikRichie commented 3 years ago

@janosfichter From my understanding, MixedReality-WebRTC converts the Argb32VideoFrame to an I420AVideoFrame before it is sent to the remote peer, and that's why you are seeing this on the receiver's side. And yes, this comes at a cost. I would love to hear if there is a way to convert the frames from the Direct3D11CaptureFramePool to I420AVideoFrame directly, so that we can avoid an unnecessary conversion inside MixedReality-WebRTC.
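
For anyone who wants to experiment with that: the external video track source also exposes an I420A callback (ExternalVideoTrackSource.CreateFromI420ACallback), so one option is to do the BGRA-to-I420 conversion yourself and hand the planes over directly. Below is a minimal, unoptimized CPU sketch of that idea; the I420AVideoFrame field names reflect my understanding of the C# API, the project must allow unsafe code, and a real implementation would likely use a libyuv-style SIMD converter or a GPU shader instead.

    // Hypothetical sketch: a callback body registered via
    // ExternalVideoTrackSource.CreateFromI420ACallback. Converts a BGRA8 buffer
    // (e.g. copied out of a SoftwareBitmap) to I420 and completes the frame request.
    // Assumes even width/height and that CompleteRequest copies the planes before returning.
    private unsafe void CompleteAsI420A(in FrameRequest frameRequest, byte[] bgra,
        uint width, uint height, int bgraStride)
    {
        int w = (int)width, h = (int)height;
        byte[] yPlane = new byte[w * h];
        byte[] uPlane = new byte[(w / 2) * (h / 2)];
        byte[] vPlane = new byte[(w / 2) * (h / 2)];

        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                int src = y * bgraStride + x * 4;
                int b = bgra[src], g = bgra[src + 1], r = bgra[src + 2];
                // BT.601 integer approximation of the RGB -> YUV conversion.
                yPlane[y * w + x] = (byte)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
                if ((x & 1) == 0 && (y & 1) == 0) // subsample chroma 2x2
                {
                    int dst = (y / 2) * (w / 2) + (x / 2);
                    uPlane[dst] = (byte)(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128);
                    vPlane[dst] = (byte)(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);
                }
            }
        }

        // Pin the planes for the duration of the hand-over.
        fixed (byte* pY = yPlane, pU = uPlane, pV = vPlane)
        {
            frameRequest.CompleteRequest(new I420AVideoFrame()
            {
                width = width,
                height = height,
                dataY = (IntPtr)pY,
                dataU = (IntPtr)pU,
                dataV = (IntPtr)pV,
                dataA = IntPtr.Zero, // no alpha plane
                strideY = w,
                strideU = w / 2,
                strideV = w / 2,
                strideA = 0
            });
        }
    }
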

Matteo-0 commented 3 years ago

Hello, I tried to follow what you have done to capture the screen, because I need to do the same, but when it comes to defining the _framePool:

_framePool = Direct3D11CaptureFramePool.Create(
                _canvasDevice,                             // D3D device
                DirectXPixelFormat.B8G8R8A8UIntNormalized, // Pixel format
                3,                                         // Number of frames
                _item.Size);                               // Size of the buffers

I get the following error: System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.'

I do not know what to do. I defined the item as:

        var picker = new GraphicsCapturePicker();
        GraphicsCaptureItem item = await picker.PickSingleItemAsync();
        _item = item;

I hope you can help me.
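
One thing worth double-checking in this situation is the device passed to Direct3D11CaptureFramePool.Create. A minimal sketch of one way the device can be set up, assuming the "D3D device" in the earlier snippets is Win2D's CanvasDevice (which implements IDirect3DDevice); a null or already-disposed device is one plausible way to end up with an access violation here.

    // Hypothetical sketch of the device setup used by the earlier snippets.
    _canvasDevice = new CanvasDevice();            // Microsoft.Graphics.Canvas (Win2D)

    _framePool = Direct3D11CaptureFramePool.Create(
        _canvasDevice,                             // D3D device
        DirectXPixelFormat.B8G8R8A8UIntNormalized, // Pixel format
        3,                                         // Number of frames
        _item.Size);                               // Size of the buffers
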

Matteo-0 commented 3 years ago

Hi, sorry, I solved the problem. Now I am able to capture the screen, but I am not able to send the frames to Unity through WebRTC; in Unity I just get an orange screen. What can I do?

Mizkoeu commented 3 years ago

@Matteo-0 Curious how you solved the problem? We're running into the same AccessViolationException and are not sure how to proceed or debug, since it's likely related to one of the native DLLs used. Thanks!