dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[PERF] Windows x86 Regression in BilinearTest.Interpol_AVX #92161

Open DrewScoggins opened 10 months ago

DrewScoggins commented 10 months ago

This was found during the manual 7.0 -> 8.0 RC1 comparison. The test history is linked below.

https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_x86_Windows%2010.0.18362%2fBilinearTest.Interpol_AVX.html

The suspected cause is https://github.com/dotnet/runtime/pull/84384.

ghost commented 10 months ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author: DrewScoggins
Assignees: -
Labels: `tenet-performance`, `tenet-performance-benchmarks`, `area-CodeGen-coreclr`, `untriaged`, `needs-area-label`
Milestone: -

BruceForstall commented 10 months ago

@EgorBo Can you take a look?

cc @tannergooding @khushal1996

khushal1996 commented 10 months ago

cc @anthonycanino @DeepakRajendrakumaran

EgorBo commented 10 months ago

I don't see any codegen diffs for the given commit range locally

EgorBo commented 10 months ago

| Method | Job | Toolchain | Mean | Error | StdDev | Ratio | Code Size |
|---|---|---|---|---|---|---|---|
| Interpol_AVX | Job-HJGTCR | \Core_Root\corerun.exe | 3.959 us | 0.6729 us | 0.0369 us | 1.01 | 906 B |
| Interpol_AVX | Job-DVKPPT | \Core_Root_base\corerun.exe | 3.913 us | 1.0288 us | 0.0564 us | 1.00 | 906 B |

No perf or asm diffs between these commits on AMD 7950X + x86. Tried enabling/disabling AVX-512, still no diffs. Could it be JCC erratum related or otherwise Intel-specific? Moving to 9.0 since x86 has lower priority.
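
As a rough sanity check on the two benchmark rows above, the arithmetic behind "no perf diffs" can be sketched as below. This is an illustration only, not part of the original run: the numbers are copied from the table, `overlaps` is a hypothetical helper, and the check assumes the Error column is the half-width of a confidence interval around the mean, which is how BenchmarkDotNet describes it.

```python
# Values copied from the two Interpol_AVX rows above, in microseconds.
base_mean, base_err = 3.913, 1.0288  # \Core_Root_base\corerun.exe (baseline)
diff_mean, diff_err = 3.959, 0.6729  # \Core_Root\corerun.exe (commit under test)


def overlaps(m1, e1, m2, e2):
    """Hypothetical helper: True if [m1 - e1, m1 + e1] and [m2 - e2, m2 + e2] overlap."""
    return abs(m1 - m2) <= e1 + e2


# Ratio as reported in the table: mean under test divided by baseline mean.
ratio = diff_mean / base_mean
# Absolute difference between the two means.
delta = diff_mean - base_mean

print(f"ratio = {ratio:.2f}")
print(f"delta = {delta:.3f} us")
print("intervals overlap:", overlaps(base_mean, base_err, diff_mean, diff_err))
```

The difference between the means is far smaller than either reported error, so on this machine the two builds are indistinguishable, which matches the conclusion above.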