microsoft / microsoft-ui-xaml

Windows UI Library: the latest Windows 10 native controls and Fluent styles for your applications
MIT License
6.29k stars 675 forks source link

SoftwareBitmapSource.SetBitmapAsync throws uncatchable exception in some circumstances #2386

Open benstevens48 opened 4 years ago

benstevens48 commented 4 years ago

Describe the bug I've come across a (rare) scenario where SoftwareBitmapSource.SetBitmapAsync throws an unhandled Win32 exception (which can't be caught by the app). The background is slightly complicated: I am creating a SoftwareBitmap from a thumbnail embedded in an AVIF image file and displaying it using a SoftwareBitmapSource. I'm using WIC to extract the thumbnail. There is obviously a bug in the decoder somewhere (it uses the Microsoft HEIF decoder with the AV1 video extension installed), because whenever I try to do anything with the IWICBitmapSource like convert it to an IWICBitmap and try to lock the pixels, it returns a failing HRESULT (invalid argument or something). Obviously that's something for the WIC team to fix, but the point is that SoftwareBitmapSource.SetBitmapAsync should look out for any exceptions like this and throw them in such a way that they can be caught, instead of ending up in a situation where an unhandled Win32 exception is thrown. Since SoftwareBitmapSource.SetBitmapAsync may be used as part of an image viewing app to show images provided by the end user, it should be resilient to any bad images or codec errors (providing the codec itself doesn't crash). Note that it's not enough to assume that the SoftwareBitmap provided is completely valid, because when you load a bitmap, according to my heuristic tests, the pixels are not actually copied from the IWICBitmapSource until the moment they are actually needed which can be quite late in the pipeline.

Here is the error message I get

Unhandled exception at 0x00007FFBB9DCA7B8 (Windows.UI.Xaml.dll) in MyApp.exe: 0xC000027B: An application-internal exception has occurred (parameters: 0x000001AE83A6B950, 0x0000000000000001).
Windows 10 version Saw the problem?
November 2019 Update (18363) Yes
codendone commented 4 years ago

@benstevens48 I'm not currently able to repro this issue. What language is your app written in? Can you try looking at the crash or a crash dump in WinDbg and then use !pde.dse to dump the stowed exceptions so we can see the error code and stack for the failure? !pde is available here in the PDE*.zip. (It is linked off the channel9 page above.) Put the appropiate x64 or x86 dll in that zip in the winext directory of a WinDbg install, and then "!pde.dse" should work on stowed exception crash dumps. (Alternatively, if you prefer to share a crash dump, I can look at that.)

benstevens48 commented 4 years ago

@codendone - I didn't really expect you to be able to repro it - attached is a minimal solution to repro it including the image that causes the crash. It' s mixture of C# for the UI and and a C++/WinRT component. Note that the image is a random image I found on the internet.

SoftwareBitmapSourceCrash.zip

codendone commented 4 years ago

Thanks for the repro!

The crash in that repro appears to be caused by a bug in the HEIF decoder, as suggested in your original post: the decoder computes the scaled bounds of the image using integer math in one place and floating point math in another, and then was unhappy that the bounds differed after the floating point values were rounded. I logged a bug on the HEIF decoder team for this (internal link).

Of course, this is still separate from your original point. I agree bad image files should not cause the app to crash, and should instead fire the Image/BitmapImage.ImageFailed event. That event usually does fire for image failures. I'm not sure if there are other failures where bad images cause an uncatchable exception in a similar stack as this HEIF failure. It may be possible with other custom uses of WIC, and it may also be possible for app code to wrap one of the WIC objects to catch and eat the error (though bubbling out the fact that an error was handled to the app would probably be challenging).

If this is a significant issue, we could add some handling in WinUI 3 to bubble out errors in this stack. This would be a non-trivial change, since this background thread code has long expected and required the imaging code to succeed at this part of the process (when we're copying bits out of the successfully decoded image).

Here is the stack where this HEIF failure was returned, which is in a function where XAML doesn't expect failure:

RTMediaFrame!SoftwareBitmapImpl::LockBuffer
Windows_UI_Xaml!AsyncCopyToSurfaceTask::CopyOperation
Windows_UI_Xaml!AsyncCopyToSurfaceTask::Execute
Windows_UI_Xaml!AsyncImageFactory::WorkCallback
Windows_UI_Xaml!CWinWorkItem::Execute
Windows_UI_Xaml!CWinWorkItem::WorkCallback
ntdll!TppWorkpExecuteCallback
ntdll!TppWorkerThread
KERNEL32!BaseThreadInitThunk
ntdll!__RtlUserThreadStart
ntdll!_RtlUserThreadStart
benstevens48 commented 4 years ago

Thanks for the info. Is it not possible to throw an HRESULT which can be passed through the ABI boundary instead of ending up with an unhandled win32 exception that my C# code can't catch? I always use the SetSourceAsync, SetBitmapAsync etc style methods inside a try-catch block in my C# code, which ought to be able to catch any exceptions. I feel like this is a must. Any unhandled win32 exception which is not caused directly by my own C++ code should be considered a bug.

I guess there is a separate question about the ImageFailed etc event which may be more difficult to handle properly. I personally don't mind about these being raised instead of catchable exception being thrown, so I'll leave that decision up to you.

benstevens48 commented 4 years ago

Having made that last point about the ImageFailed event, I guess for people that use lots of image assets it's quite bad if the app crashes just because one image fails to load correctly. Even though the assets are known at design time, so this is less of an issue, there is still the possibility of the developer having a fixed version of the codec installed and the client an old broken version, which could result in an unexpected crash at runtime. But this is still lower priority than getting rid of the unhandled win32 exception in favour of an HRESULT.

codendone commented 4 years ago

Apps can reference images on the web, so not crashing on a bad image is definitely the intent. The ImageFailed event provides a way to know and respond to bad images, even when the image was only specified via Uri.

This HEIF failure happens in code on a background thread where we previously haven't seen failures, even for various bad images. We can make this code capture the HRESULT and appropriately route it back through the relevant code on the UI thread, but this won't be a small fix and can't be done before WinUI 3.

If this is a significant issue for you which needs a solution soon, we can work with the HEIF team to see if it is possible to write a wrapper in your native code to allow you to catch this failure.

benstevens48 commented 4 years ago

Thanks for the offer of help with a workaround. It's not a massive issue for me - I've only managed to find that one image that causes a problem. I was thinking that the easiest fix would be for me to somehow check if the IWICBitmap is valid, but this pipeline in my app is used to generate hundreds of thumbnails on the fly and I don't want to do anything to slow it down, so I'm not sure about calling Lock although I haven't tested the performance. I tried querying the pixel format but that didn't return an error. Maybe there's another function I could call to check validity with minimal performance impact?

On a slightly related note, will things like SoftwareBitmapSource be open-sourced with WinUI 3? When I heard the initial WinUI 3 announcement I was looking forward to that, but subsequently I think I've read that the low-level parts will not be open-source. Apart from the opportunity to fix the above issue, another reason I am interested is to write a high-performance, low memory version of SoftwareBitmapSource. I am happy to recreate my images in a device lost scenario, and based on some testing, I think that SoftwareBitmapSource probably keeps a copy of the original bitmap, so I could at least halve the memory usage and perhaps remove a copy operation. I am wondering how to do this myself using a composition drawing surface, but I'm not sure of the most efficient way - it would be best probably to copy the pixels directly from a WIC bitmap into the surface, but I don't think that's possible. Plus I have to write a custom control in order to display the surface instead of using my existing image control. Plus I'm not sure how best to re-use composition graphics devices or just to create a new one every time.

codendone commented 4 years ago

The HEIF bug looks to be specific to the Lock function. I understand your reasonable concern about performance, and I'm also not sure if other parts of the decode process need to have run in order for Lock to hit this issue (since in this case the decoder had information saying it should scale down by a factor of 16, which might have been computed based on the size XAML wanted to decode to).

Yes, SoftwareBitmapSource will be open-sourced with WinUI 3, since that object is part of XAML. But SoftwareBitmap will not. So the stack here will only be partially open-source, split at the XAML boundary. XAML's handling (or lack thereof) of this failure in AsyncCopyToSurfaceTask::CopyOperation will be part of the open-source code. So this issue would be something you could fix.

Regarding performance, SoftwareBitmapSource holds on to the SoftwareBitmap so that we can ask it for the bits again if there is a device lost or a need to redecode it for a different render size. The bits could be held in system memory if the SoftwareBitmap holds on to the bits (which I think depends on how it is created, including the behavior of any objects used to create it). It is certainly possible for your code to control the process end-to-end by using a CompositionDrawingSurface or a SurfaceImageSource, but those can have some costs as well, such as the cost of a separate Direct3D device, the cost of a separate graphics memory surface (compared to sharing the atlas(es) XAML uses), and not decoding to render size (unless you know the size and handle this, of course).

benstevens48 commented 2 years ago

@codendone - it seems the same bug now also occurs if you try to load an AVIF image in a SoftwareBitmapSource and the AV1 Video Extension package is not installed. Again, this is a fail fast exception which it is impossible to catch.

benstevens48 commented 1 year ago

I assume this still needs fixing in WinUI3.

fm-sys commented 9 months ago

For me this line produces the error (and I'm not able to catch it):

StorageItemThumbnail thumbnail = await storageFile.GetThumbnailAsync(ThumbnailMode.PicturesView);

I'm loading thumbnails of user pictures, so I cannot make sure there is no bad one.

Seeing that this rather problematic bug exists since 3 years now, can you estimate if and when there will be a fix?