Open performanceautofiler[bot] opened 2 years ago
Suspect: https://github.com/dotnet/runtime/pull/62689 cc @pedrobsaila
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch. See info in area-owners.md if you want to be subscribed.
| Author: | performanceautofiler[bot] |
|---|---|
| Assignees: | - |
| Labels: | `tenet-performance`, `tenet-performance-benchmarks`, `area-CodeGen-coreclr`, `untriaged` |
| Milestone: | - |
My PR generated some regressions on SciMark2:
My change is probably the one causing the regression; let me see if I can fix it. Just one question about these comments in SciMark2:
/// This software is likely to burn your processor, bitflip your memory chips
/// anihilate your screen and corrupt all your disks, so you it at your
/// own risk.
Do I really risk anything if I debug these tests locally?
We run them on a daily basis, so hopefully not? 🙂 No idea who put that comment there.
I think it's a joke -- best I can tell it was added when the benchmark code was ported from Java to C#, many years ago.
Let me know if you need any help digging into this. The issues here might be similar to the problems we have with if conversion. For example, by changing something like
if ((x op const) && (y op const)) { ... }
to
if ((x op y) op const) { .... }
we run the risk that if `(x op const)` is usually false and `y` is the result of some long-latency computation or dependence chain, then evaluating `y` eagerly can make things slower (and similarly when the combiner is `||` and `(x op const)` is usually true).
One way to avoid the potential downside is to not do this optimization when the predicates are in a loop, or when profile data indicates evaluation of `y` is unlikely.
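The concern above can be made concrete with a small C sketch. It assumes one particular instance of the pattern, `x == 0 && y == 0` rewritten as `(x | y) == 0`; the thread does not name the operators, so that choice is an assumption. `compute_y`, `short_circuit`, `combined`, and `count_y_evaluations` are hypothetical names made up for illustration, and the counter only stands in for "long-latency work", it is not a timing measurement.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the stand-in for the expensive operand is evaluated. */
static int y_evaluations = 0;

/* Hypothetical placeholder for a long-latency computation producing y. */
static int compute_y(int seed) {
    y_evaluations++;
    return seed % 7;
}

/* Original shape: (x op const) && (y op const).
   y is computed only when x == 0 holds. */
static bool short_circuit(int x, int seed) {
    return x == 0 && compute_y(seed) == 0;
}

/* Combined shape: (x op y) op const.
   Same result, but y is computed on every call. */
static bool combined(int x, int seed) {
    return (x | compute_y(seed)) == 0;
}

/* Runs pred over n inputs in which x is zero only for every 100th input
   (so the first predicate is usually false) and returns how many times
   y had to be evaluated. */
static int count_y_evaluations(bool (*pred)(int, int), int n) {
    assert(n >= 0);
    y_evaluations = 0;
    for (int i = 0; i < n; i++) {
        int x = (i % 100 == 0) ? 0 : 1;
        (void)pred(x, i);
    }
    return y_evaluations;
}
```

With `n = 1000`, the short-circuit form evaluates `y` 10 times and the combined form 1000 times, while both return the same boolean for every input. Whether the extra evaluations matter depends on how expensive `y` really is.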
> Let me know if you need any help digging into this
I ran the benchmark on x64/x86 Windows/Linux and could not reproduce the 100 ms regression: I see slight regressions/improvements (less than 30 ms) depending on the environment. If you have an arm64 machine, it would help to see the assembly diffs, just to be sure no bad assembly is generated.
> we run the risk that if `(x op const)` is usually false and `y` is the result of some long-latency computation or dependence chain then evaluating `y` eagerly can make things slower (and similarly when the combiner is `||` and `(x op const)` is usually true).
Thanks for the hint. I'll see if I can improve regressions based on this. I see something similar in
I've been playing with the SciMark2.kernel perf tests for a while, and I don't think the regression is caused by eagerly evaluating an expensive operand in `if ((x op y) op const) { ... }` where `x` is false and `y` is an expensive operation. The optimisation is applied only to array range checks of the form `indexer >= 0`, so the operation is not expensive and is always true (no IndexOutOfRangeException is thrown in the code).
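For the range-check case, here is a minimal C sketch of what such a fold could look like. It assumes the combiner is bitwise OR, so that `i >= 0 && j >= 0` becomes `(i | j) >= 0`; the thread only says the checks have the form `indexer >= 0`, so the exact rewrite is an assumption. It also assumes two's-complement integers. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two separate lower-bound checks, as they would appear in source. */
static bool lower_bounds_separate(int32_t i, int32_t j) {
    return i >= 0 && j >= 0;
}

/* Folded form: in two's complement the sign bit of (i | j) is set
   exactly when at least one of i, j is negative. */
static bool lower_bounds_folded(int32_t i, int32_t j) {
    return (i | j) >= 0;
}

/* Compares both forms on every pair in [-limit, limit] and returns
   the number of pairs on which they disagree. */
static int count_mismatches(int32_t limit) {
    assert(limit >= 0);
    int mismatches = 0;
    for (int32_t i = -limit; i <= limit; i++) {
        for (int32_t j = -limit; j <= limit; j++) {
            if (lower_bounds_separate(i, j) != lower_bounds_folded(i, j)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Both operands here are plain index values that are already available, so the folded form trades a compare and branch for a single OR. That is consistent with the observation that eager evaluation is unlikely to explain the regression in this case.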
Here are the assembly diffs on Windows x64 (same diffs on Linux x64) for the main methods of SciMark2.kernel:
I did not find any suspicious diffs in the assembly. The diffs are not significant enough to produce meaningful improvements or regressions. @AndyAyersMS I think I need a little help digging into this.
It seems like things are back to the way they were:
Likely from #77728