dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.18k stars 4.72k forks source link

Double.CompareTo() performance hurt by not being inlined #56493

Closed rickbrew closed 2 years ago

rickbrew commented 3 years ago

I was profiling some Paint.NET code in WPA the other day and noticed that Double.CompareTo() was taking a non-trivial amount of the execution time:

image

The scenario here in the app is to select an area of the canvas (Rectangle Select, Ellipse Select, Magic Wand, whatever), switch to the Move Selected Pixels too, and do some transforms on it (rotate, scale, etc.).

The selection is represented by a poly-polygon (IReadOnlyList<Point2Double[]>) which is then scan-converted into an alpha mask (bitmap). Scan-conversion is the standard algorithm for doing this and involves lots of sorting, at one point by Y values and then repeatedly by X values. Double.CompareTo() is used quite a bit here, in other words. In the screenshot, you can see that the most time is spent in 1) filling an alpha mask bitmap from the list of scans/rectangles from the scan-conversion code (MaskFromScansRenderer) (this code is vectorized quite a bit now, lots of buffer filling), 2) rendering the transformed pixels onto the current layer's bitmap (TransformedNearestNeighborContentRenderer), 3) some other rendering/compositing and hit-testing (MoveToolContentRenderer), then 4) System.Double::CompareTo() (!!!), and then most of the rest is the actual scan-conversion, sorting, and some more buffer copying.

To investigate further, I crafted up a toy benchmark in LINQPad where I timed how long it took to sort an array of 10 million randomly generated doubles (10 iterations per benchmark). All the usual perf tricks were employed, including pointers to elide bounds checks, and an optimized copy of Sort<T, TList, TComparer>() where everything is structs, interfaces, and tagged with AggressiveInlining. For the first benchmark, the stock Double.CompareTo() is used, and for the second benchmark a copy of it with [MethodImpl(MethodImplOptions.AggressiveInlining)] is employed:

My results:

image

(you can ignore the CompareNoNaN result, it was just for fun to see what happens if I remove the branches that test for NaN)

More than twice as fast -- indicating that the overhead of not inlining Double.CompareTo() is substantial.

My hypothesis here is that adding [MethodImpl(MethodImplOptions.AggressiveInlining)] to Double.CompareTo() (and probably Single.CompareTo()) would be an easy and impactful win. I suspect that these methods aren't used extensively throughout the runtime or other codebases, so the codegen size impact should be small, and where they are used the performance gain would be worth it.

I'm happy to craft up a proper benchmark with real data that others can inspect. I'm submitting this issue so I stop procrastinating on it :)

cc @EgorBo, @tannergooding , who were part of the conversation about this on Discord #lowlevel

dotnet-issue-labeler[bot] commented 3 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

AraHaan commented 3 years ago

@rickbrew what is the perf of adding inlining to Single.CompareTo?

rickbrew commented 3 years ago

@rickbrew what is the perf of adding inlining to Single.CompareTo?

For the sorting benchmark the results were near identical. Same amount of time, same amount of % improvement.

EgorBo commented 3 years ago

A few notes: 1) StandardOptimizationData.mibc profile that we ship seems to be missing a lot of FP-related methods, e.g. I couldn't find info for any of the Double API, we need some training scenario for those (e.g. WPF?) 2) Double.CompareTo is inlineable in DynamicPGO mode in hot call-sites: image

I personally don't mind just adding AggressiveInlining for it, Double is often involved in intensive calculations. However, again, we can do it via StandardOptimizationData.mibc

danmoseley commented 3 years ago

Maybe Paint.NET should be a training scenario somehow?

rickbrew commented 3 years ago

We can certainly arrange that, @richlander and I already have something in place to enable that sort of thing

danmoseley commented 3 years ago

In the short term, since that can't happen immediately, would it make sense to mark aggressive inlining (temporarily?) on Double and Short?

rickbrew commented 3 years ago

In the short term, since that can't happen immediately, would it make sense to mark aggressive inlining (temporarily?) on Double and Short?

Sure, here's a PR for review etc. #56501

AraHaan commented 3 years ago

I feel like the benchmarks you made should be added to it as well (if possible) so that way they can be used in the future for future comparison.

danmoseley commented 3 years ago

If you mean microbenchmarks (like the ones we have in dotnet/performance) it's hard to use them to measure any "benefit" from aggressive inlining. It almost inevitably makes microbenchmarks faster, although in real apps it is not necessarily a benefit.

It would be worth checking that we have good enough coverage of Compare() there, but only to protect the implementation itself. https://github.com/dotnet/performance/tree/main/src/benchmarks/micro

richlander commented 3 years ago

We rebuilt the whole PGO system in .NET 6. I have been thinking about how we can take advantage of real apps going forward. Clearly, Paint.NET being on .NET 5+ is a big win for that.

In the future, I'd like to establish a model where we can change PGO data as part of a servicing update. We'd need a compelling and coherent plan for that (which I don't have), but that's the direction I have in mind.

@davidwrighton

davidwrighton commented 3 years ago

I believe we actually already have the means for updating PGO data in a servicing update, but we don't actually have any plans to do so.

richlander commented 3 years ago

Cool. I assumed we had no plans for that. It's something that we should consider and discuss, particularly for a release that is supported for three years.

JulieLeeMSFT commented 3 years ago

@rickbrew I am trying to understand when the milestone should be for this issue. How critical is it to include this in .NET6? Are you planning to complete this in .NET6?

danmoseley commented 3 years ago

@JulieLeeMSFT Seems to me we just need to merge https://github.com/dotnet/runtime/pull/56501 for .NET 6.0. Longer term we should increase our PGO test app matrix.

rickbrew commented 3 years ago

@rickbrew I am trying to understand when the milestone should be for this issue. How critical is it to include this in .NET6? Are you planning to complete this in .NET6?

This isn't critical for me -- I made my own InlinedCompareTo method that I'm using for now inside my app.

JulieLeeMSFT commented 3 years ago

This isn't critical for me -- I made my own InlinedCompareTo method that I'm using for now inside my app.

Thanks @rickbrew. I set the milestone to .NET 7. @tannergooding I assigned this issue to you to review the PR and merge it.

EgorBo commented 2 years ago

Closed by https://github.com/dotnet/runtime/pull/56501