dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Optimize vector operations in mono interpreter #34751

Closed BrzVlad closed 1 year ago

BrzVlad commented 4 years ago

On the set of microbenchmarks from dotnet/performance, the interpreter is by far the slowest on vector-related benchmarks. This slowness also affects other parts of the BCL, mainly Span. An initial improvement might come once https://github.com/dotnet/runtime/issues/34750 is addressed.

The slowest microbenchmarks on the interpreter, compared to the Mono JIT, all seem to be vector-related:

| Benchmark Name | results-jit | results-interp |
| --- | --- | --- |
| SIMD.ConsoleMandel.VectorFloatSinglethreadRaw | 1 | 850.62 |
| SIMD.ConsoleMandel.VectorDoubleSinglethreadRaw | 1 | 692.85 |
| Burgers.Test3 | 1 | 484.42 |
| BenchmarksGame.MandelBrot_7.Bench(size: 4000; lineLength: 500; checksum: "C7-E6-66-43-66-73-F8-A8-D3... | 1 | 303.55 |
| SIMD.ConsoleMandel.VectorFloatSinglethreadADT | 1 | 197.83 |
| SIMD.ConsoleMandel.VectorDoubleSinglethreadADT | 1 | 156.69 |
| BilinearTest.Interpol_Vector | 1 | 99.04 |

ghost commented 4 years ago

Tagging @brzvlad and @lewurm as area owners

SamMonoRT commented 2 years ago

@BrzVlad - do we still need this open?

BrzVlad commented 2 years ago

The current performance that I'm getting on one of these tests seems to be about 12 times better than at the time of reporting. However, this is still very slow compared to JIT. Looking at the generated code, we should be able to greatly improve performance if we implement loop unrolling coupled with removal of some local var indirections.
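
To make the two ideas in the comment above concrete, here is a minimal sketch in C of what "loop unrolling" and "removing local var indirections" mean in general terms. It is not code from the Mono interpreter and does not reflect how the linked PR implements anything; the `vec4f` type and both function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical 4-lane float vector, standing in for a Vector<float>-like value. */
typedef struct { float lane[4]; } vec4f;

/* Baseline: every lane access goes through an extra pointer indirection
   (the "local" holds the address of the vector), and the work is driven
   by a counted loop. */
void vec4f_add_looped(vec4f **dst, vec4f **a, vec4f **b)
{
    for (size_t i = 0; i < 4; i++)
        (*dst)->lane[i] = (*a)->lane[i] + (*b)->lane[i];
}

/* Transformed: the indirections are resolved once up front, and the loop
   is unrolled into four straight-line adds with no counter or branch. */
void vec4f_add_unrolled(vec4f **dst, vec4f **a, vec4f **b)
{
    const vec4f *pa = *a;
    const vec4f *pb = *b;
    vec4f *pd = *dst;

    pd->lane[0] = pa->lane[0] + pb->lane[0];
    pd->lane[1] = pa->lane[1] + pb->lane[1];
    pd->lane[2] = pa->lane[2] + pb->lane[2];
    pd->lane[3] = pa->lane[3] + pb->lane[3];
}
```

Both functions compute the same result. An optimizing C compiler would likely produce similar machine code for each, but an interpreter executes its instructions one at a time, so each removed load, counter update, and branch is work it no longer has to dispatch.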

BrzVlad commented 1 year ago

Addressed in https://github.com/dotnet/runtime/pull/86859