dotnet / wpf

WPF is a .NET Core UI framework for building Windows desktop applications.
MIT License
7.06k stars 1.17k forks source link

Improve allocation with System.Buffers #769

Open Mrnikbobjeff opened 5 years ago

Mrnikbobjeff commented 5 years ago

I mainly looked through windowbase and there are various array allocations to read stream or similar. Other small arrays are so small that they could be stackallocated as they are constant length. For a perfect example check CFStream. These allocations could be removed with System.Buffers which would not have a detrimental impact on anything as far as i can tell. I created a branch with the possible conversions in the WindowBase project. The improvements remove several immediate LOH allocations which stick around for a long time, in some instances I decided to leave the byte array allocation as the direct usage of these arrays required zeroed arrays. We could of course clear them after getting them from the ArrayPool, but then we would have to consider the tradeoff between allocation(cheap&fast) vs pool obtaining(cheap&fast) and zeroing the array (more expensive than allocation)

miloush commented 5 years ago

AFAIK System.Buffers types are in a separate assembly, so keep in mind this would a) introduce a new dependency b) potentially have performance impact when the assembly is loaded.

It is cool that you got good results with CFStream, but how often is CFStream used?

Mrnikbobjeff commented 5 years ago

System.Buffers and with that ArrayPool is a part of .Net Core 3. If it resides in a different assembly I don't know, but at least no dependency. CFStream was only one instance where I noticed this, though CFStream usage is rare there are other examples such as array allocation in marshalling code for strings which will be called plenty. Even if we would decide against improving CFStream because it is rarely used we have plenty of potential to do this in other more frequently used code

Mrnikbobjeff commented 5 years ago

Okay, after some initial auditing I found several Instances where using a pooled array has no downside and is easily feasable: https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/dataobject.cs#L3257 https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Ink/StrokeCollection.cs#L154 https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/ColorContext.cs#L613 https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/Imaging/BitmapDecoder.cs#L1331 https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/Imaging/BitmapSource.cs#L1882

Stackalloc candidates: 2 Bytes, needs method inline. Should be easy, only one call https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Input/TextCompositionManager.cs#L441 8 bytes, also 8 byte short array allocation that can be stackalloced safely: https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/ColorTransform.cs#L207

On a seperate note, while looking through the source I found other small optimizations, each listed here with context: Easy really small optimization, eliminate the null check here by changing it to fixed(byte *pbData = &buffer[0]). This would probably benefit from being audited seperately, as there are a ton of these throughout the code where we can guarantee that the fixed object is not null. This optimization is already present in BitmapSource. https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/ByteStreamGeometryContext.cs#L446 Same here: https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/ColorContext.cs#L657 A square ton of these checks are redundant here: https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/Geometry.cs#L988 Finally, I am curious whether ignoring the return value of bytes read is intended here: https://github.com/dotnet/wpf/blob/1d4abe6b45ed1a1e9fe605ae1c9949d6d2a99fb4/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/dataobject.cs#L1670 Might be a security issue. This is just a brief overview of pooling candidates, we can eliminate some LOH allocations and some small byte allocations while marshalling strings and streams. If this is a desirable change I would love to start working on it!

YoshihiroIto commented 4 years ago

I am interested in the progress of this discussion. I am trying to improve the drawing efficiency ( at PresentationCore ).

UIElement repeats buffer allocation par drawing. Especially during animation, it is executed a lot in a short time.

Specifically, it is RenderData.EnsureBuffer processing. It is inefficient because buffer allocation is performed for each update.

The changes are local and not a problem. I think ArrayPool is the best way to solve this problem.

What do you think?

https://github.com/dotnet/wpf/blob/7b9ffd263a845e57c506d3c08509b955d3d97907/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/RenderData.cs#L466-L496

Mrnikbobjeff commented 4 years ago

You would have to profile it to see whether this is a hot path, and whether the usage of RenderData actually permits the buffer to be a pooled buffer. That I can not answer. I believe that using buffers here might not actually speed things up very much thoguh, as array creation is nearly free speed wise. I would wager that the majority of time in that function is spent in the CopyTo function. The exception would be if the amount of allocations from this method is so high that the time spent renting and returning pools is less than the time spent in GC because of this garbage(which I don't think is the case, but you would need to profile to be sure)

deeprobin commented 2 years ago

It would make sense to use stackallocation in many places and to use the ArrayPool for things that cannot be stack-allocated.

lindexi commented 2 years ago

I am interested in the progress of this discussion. I am trying to improve the drawing efficiency ( at PresentationCore ).

UIElement repeats buffer allocation par drawing. Especially during animation, it is executed a lot in a short time.

Specifically, it is RenderData.EnsureBuffer processing. It is inefficient because buffer allocation is performed for each update.

The changes are local and not a problem. I think ArrayPool is the best way to solve this problem.

What do you think?

https://github.com/dotnet/wpf/blob/7b9ffd263a845e57c506d3c08509b955d3d97907/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/RenderData.cs#L466-L496

@YoshihiroIto Sorry for delay, I fixed it in https://github.com/dotnet/wpf/pull/5392