microsoft / psi

Platform for Situated Intelligence
https://github.com/microsoft/psi/wiki
Other
529 stars 92 forks source link

UWP MediaCapture: Option for NV-12 encoded images or BGRA images #274

Closed austinbhale closed 1 year ago

austinbhale commented 1 year ago

Hi, sorry opening a different pull request since I accidentally confused RGBA and BGRA formats, so I wanted to change the branch name to reflect that these are BGRA images. A working sample video can be found here: https://github.com/StereoKit/StereoKit/pull/574

I noticed that the ImageFromNV12StreamDecoder takes a significant amount of time for each NV12-encoded image. In terms of JPEG compression, it would be nice to skip this decoding prerequisite as mentioned here: https://github.com/microsoft/psi/issues/223#issuecomment-1111201702. The conversion to BGRA format with a software bitmap in C# is not the fastest solution on the HoloLens 2, but in my performance tests, it only took about 10 ms per image versus up to 100 ms of decoding time with the ImageFromNV12StreamDecoder. This might be a good option to have for devs, even though a native decoding of the NV-12 format would be even faster. Let me know what you think!

austinbhale commented 1 year ago

As an update, the images can also be converted to \psi's PixelFormat types, with the additions of RGBA and RGBX.

austinbhale commented 1 year ago

Glad to! Sorry for the delay. I've reset to the commit where it is purely for the BGRA format. I hadn't considered all of the trickling effects for the Imaging code, so I'll have that as a separate PR in the future. Appreciate your time!

chitsaw commented 1 year ago

/azp run

azure-pipelines[bot] commented 1 year ago
Azure Pipelines successfully started running 1 pipeline(s).
austinbhale commented 1 year ago

Hi, so one more note before merging this PR into master. I just realized it may be possible to specify the MediaEncodingSubtype as Bgra8 instead of going through the software bitmap conversion. I'll be testing this out today.

Overload reference: https://learn.microsoft.com/en-us/uwp/api/windows.media.capture.mediacapture.createframereaderasync?view=winrt-22621#windows-media-capture-mediacapture-createframereaderasync(windows-media-capture-frames-mediaframesource-system-string)

chitsaw commented 1 year ago

Sounds great! We'll hold off on merging this PR for now then. Let us know what you find. Thanks!

austinbhale commented 1 year ago

drumroll ... Yes, it is possible to specify the subtype ahead of time! Couple notes:

  1. https://github.com/microsoft/psi/pull/274/files#diff-6c969abe585f4024ec287ddf2885c70be57038e99194baef229c544cbb77a850R298 <- This line could be more accepting of other encoded types once the proper emitters and encoded image types are written for them. Ideally in the next PR.
  2. It's possible to set OutputEncodedImage and OutputImage to true from the settings, but it doesn't really seem practical why you'd want both. So if this were to happen, the EncodedImage would win.
  3. I noticed a significant performance gain by setting the MemoryPreference to Auto instead of CPU in https://github.com/Nakamir-Code/nakamir-psi/blob/feat/uwp-mediacapture-bgra-image/Sources/MixedReality/Microsoft.Psi.MixedReality.UniversalWindows/MediaCapture/PhotoVideoCamera.cs#L454, since it utilizes both the CPU and GPU. Could this PR also set this to Auto or do you all have reasoning for choosing purely CPU? I find the CPU runs out very quickly with a non-native format like BGRA rather than NV12.
chitsaw commented 1 year ago

drumroll ... Yes, it is possible to specify the subtype ahead of time! Couple notes:

That's awesome! Thanks for adding this.

  1. https://github.com/microsoft/psi/pull/274/files#diff-6c969abe585f4024ec287ddf2885c70be57038e99194baef229c544cbb77a850R298 <- This line could be more accepting of other encoded types once the proper emitters and encoded image types are written for them. Ideally in the next PR.

Sure, depending on which subtypes the device supports.

  1. It's possible to set OutputEncodedImage and OutputImage to true from the settings, but it doesn't really seem practical why you'd want both. So if this were to happen, the EncodedImage would win.

Could we throw a NotSupportedException instead to make the user aware that the current implementation doesn't support both?

  1. I noticed a significant performance gain by setting the MemoryPreference to Auto instead of CPU in https://github.com/Nakamir-Code/nakamir-psi/blob/feat/uwp-mediacapture-bgra-image/Sources/MixedReality/Microsoft.Psi.MixedReality.UniversalWindows/MediaCapture/PhotoVideoCamera.cs#L454, since it utilizes both the CPU and GPU. Could this PR also set this to Auto or do you all have reasoning for choosing purely CPU? I find the CPU runs out very quickly with a non-native format like BGRA rather than NV12.

Unfortunately, I think setting this to Cpu is required in order to populate the VideoMediaFrame.SoftwareBitmap property according to these remarks. Curious how you were able to get it to work with this set to Auto - were you not accessing the frame's SoftwareBitmap?

austinbhale commented 1 year ago

Could we throw a NotSupportedException instead to make the user aware that the current implementation doesn't support both?

Done! I hope the exception is clear. Let me know if you'd like it to be modified.

Unfortunately, I think setting this to Cpu is required in order to populate the VideoMediaFrame.SoftwareBitmap property according to these remarks. Curious how you were able to get it to work with this set to Auto - were you not accessing the frame's SoftwareBitmap?

Can confirm that the bitmap is null.. I must've had a bool turned off for that time. The performance gain surely explains it. :)

chitsaw commented 1 year ago

Thanks for your latest changes. Looks good!

chitsaw commented 1 year ago

/azp run

azure-pipelines[bot] commented 1 year ago
Azure Pipelines successfully started running 1 pipeline(s).