[DWrite] Crash caused by interaction between native inline objects, TextRenderer.DrawInlineObject, and Configuration.EnableReleaseOnFinalizer.

Jayonas commented 6 years ago

If you have the Configuration.EnableReleaseOnFinalizer option enabled and an InlineObject whose implementation is native is passed into a managed implementation of TextRenderer.DrawInlineObject, then there ends up being a ref counting mismatch that leads to more decrements than increments on the underlying IDWriteInlineObject, which eventually tends to result in an access violation when attempting to access/delete the inline object after it has already been deleted.

Here's what I think is happening: DWrite calls IDWriteTextRenderer::DrawInlineObject and SharpDX implements that by wrapping the IDWriteInlineObject in a managed InlineObject before passing it on to the caller's TextRenderer.DrawInlineObject implementation. Neither me or SharpDX explicitly do anything to affect its ref count (e.g. calling Dispose) in that process, which is exactly as desired. However, when Configuration.EnableReleaseOnFinalizer is enabled, SharpDX calls Dispose on the InlineObject when it is finalized, which in turn decrements the ref count on the underlying IDWriteInlineObject without any corresponding increment ever having happened.

A complicating factor is that this situation doesn't really have an effect if the InlineObject passed to TextRenderer.DrawInlineObject has a managed implementation rather than a native one. Managed objects aren't ref counted, and they're only finalized when no other managed code can reference them anyway, so their being "released" on finalization doesn't really have any noticeable effect.

The reason that this is problematic even for SharpDX users who never deal with native code is that EllipsisTrimming wraps a natively implemented inline object. So anyone that wants to use DWrite's built in ellipsis trimming may encounter this issue if they've also implemented their own TextRenderer and use Configuration.EnableReleaseOnFinalizer.

I wasn't able to find any good way to detect the exact combination of circumstances that cause this problem and implement a general solution. My specific workaround has been to make sure that I never use an inline object whose implementation is native, which meant that I had to stop using EllipsisTrimming/IDWriteFactory::CreateEllipsisTrimmingSign. I replaced it by creating my own managed implementation of InlineObject which wraps a TextLayout in the same way that the native implementation of the ellipsis trimming object does.

(Note that I'm basing this off my usage of SharpDX 3.1.1, which I realize is out of date, but it also doesn't look like much has happened in the DWrite code since then so I think it's likely that this is still an issue.)

Jayonas commented 6 years ago

I've recently updated to SharpDX 4.2.0 and have some updated information about this issue.

Turns out that my workaround -- only use managed implementations of IDWriteInlineObject -- leaked memory because ComObjectShadow never freed its GCHandle, which I'd expect that it should do either when its ref count reached zero (in 3.1.1 it was the thing that maintained the ref count for callbackables) or when it was finalized (because I have Configuration.EnableReleaseOnFinalizer enabled), but it actually literally did nothing when its ref count reached zero, and it never looked at that configuration option. On the other hand, the fact that it didn't do anything when its ref count reached zero is the very reason that my workaround avoided the crash. Since I couldn't find a way to fix the memory leak without re-introducing the crash, and since it looked like significant improvements were made to the ref count implementation in newer SharpDX versions, I finally bit the bullet and tried updating.

Unfortunately, the update broke my workaround and didn't actually fix the original crash either. It broke my workaround because it now correctly frees/releases/disposes of things when the ref count reaches zero. That's good, but it brought me back to square one for investigating the original crash. The good news is that I now have some better specifics and a better workaround.

Whether or not an InlineObject implementation is native or managed no longer matters. All that matters is: I've implemented my own TextRenderer, I have Configuration.EnableReleaseOnFinalizer enabled, and I'm using inline objects. When TextRenderer.DrawInlineObject is called, the concrete type passed in is an InlineObjectNative, which just wraps the native pointer passed in from native DWrite. This InlineObjectNative never added its own ref, but it does release a ref when it is finalized, due to Configuration.EnableReleaseOnFinalizer being true! So sometime after each call to TextRenderer.DrawInlineObject, the InlineObjectNative that was passed in would be finalized, disposed, and release a ref that's owned by something else, which unsurprisingly leads to trouble.

One thing that hasn't changed from 3.1.1, though, is that CallbackBase (which now manages the ref count in 4.2.0) still doesn't check the Configuration.EnableReleaseOnFinalizer option, and so still doesn't release its ref automatically upon finalization as expected. That complicated the investigation a bit, because the extra AddRef from that offset one of the extra Releases from the InlineObjectNative problem.

So the end result of my new investigation was to pinpoint two issues that occur when using Configuration.EnableReleaseOnFinalizer:

Every time TextRenderer.DrawInlineObject is called, the drawn object incorrectly loses a ref.
Any CallbackBase that isn't explicitly Disposed will leak a ref/memory.

With the root causes pinned down, new workarounds are pretty straightforward:

Call AddReference on the passed InlineObject in TextRenderer.DrawInlineObject.
Always explicitly call Dispose on any CallbackBase derivatives.

Additionally, I can go back to using the EllipsisTrimming wrapper around the native trimmer object rather than rolling my own managed implementation of it.

Unfortunately, suggestions for actual permanent fixes in SharpDX aren't quite so obvious. Calling AddReference on InlineObjectNative automatically before passing it to the TextRenderer seems like an obvious fix, but you only want to do it if Configuration.EnableReleaseOnFinalizer is true. But then what happens if the user changes that config setting between that AddReference call and when the object is finalized? Maybe InlineObjectNative needs some additional policy options regarding how it handles ref counting.

Likewise, it would seem obvious that CallbackBase should just pay attention to Configuration.EnableReleaseOnFinalizer the same way that ComObject does, but I think that may end up having unintended consequences. When I was fully immersed in this investigation I feel like I discovered some reason that it was good that CallbackBase was ignoring Configuration.EnableReleaseOnFinalizer but I can't remember what that reason was or if it's still valid, and I don't have anything in my notes about it. Hopefully someone more knowledgeable about SharpDX internals could say with more conviction whether or not such a change would be dangerous.

jkoritzinsky commented 6 years ago

Hey I just saw your update. Thanks for investigating this! I think you're on the right track on both fronts of the issue. CallbackBase should respect the configuration knobs. For the InlineObjectNative stuff, would adding a reference before passing it into TextRenderer and then disposing of the instance after the call work for you as a user? Then the code would behave correctly with and without ReleaseOnFinalizer turned on.

Jayonas commented 6 years ago

Yeah, that's a good idea to just explicitly dispose it after the call. Along with the corresponding AddRef, I think that would solve the issue with InlineObjectNative and should "just work" from the user standpoint, so it sounds great to me.

I'll defer to you on updating CallbackBase to respect the configuration. It does seem like it should; I just wish I could remember what caused me to think that it might be good that it doesn't. Hopefully if there's some good reason then testing will make it obvious when you try changing it.

Thanks for any work you can do on this!

sharpdx / SharpDX

[DWrite] Crash caused by interaction between native inline objects, TextRenderer.DrawInlineObject, and Configuration.EnableReleaseOnFinalizer. #1011