sebastienros / esprima-dotnet

Esprima .NET (BSD license) is a .NET port of the esprima.org project. It is a standard-compliant ECMAScript parser (also popularly known as JavaScript).
BSD 3-Clause "New" or "Revised" License
425 stars 75 forks source link

Improve ArrayList and NodeList #396

Closed lahma closed 1 year ago

lahma commented 1 year ago

I was checking how much faster it is to use span instead of indexing (Jint uses indexing at the moment) and I though that it shouldn't have that much difference as it had so tweaked code a bit.

Results

Modified benchmark (no boxing for result) was used against both main and this PR. The couple byte allocations in 0 size test case ForLoopBenchmark is quite curious, but didn't find out the reason. Now indexing is as fast as spans or even faster for small lists (0-1), spans start to shine when iterating more items.

💡 We maybe should consider converting visitors to use foreach + span.


BenchmarkDotNet v0.13.7, Windows 11 (10.0.23516.1000)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 7.0.400
  [Host]     : .NET 6.0.21 (6.0.2123.36311), X64 RyuJIT AVX2
  DefaultJob : .NET 6.0.21 (6.0.2123.36311), X64 RyuJIT AVX2

Esprima.Benchmark.AstTreeWalkBenchmark

Diff Method Mean Error Allocated
Old VisitChildren 2.565 ms 0.0151 ms 35 B
New 2.572 ms (0%) 0.0140 ms 35 B (0%)

Esprima.Benchmark.FileParsingBenchmark

Diff Method FileName Mean Error Allocated
Old ParseProgram angular-1.2.5 10.231 ms 0.0621 ms 4095.44 KB
New 10.238 ms (0%) 0.0449 ms 4095.76 KB (0%)
Old ParseProgram backbone-1.1.0 1.348 ms 0.0018 ms 647.77 KB
New 1.329 ms (-1%) 0.0083 ms 647.8 KB (0%)
Old ParseProgram jquery-1.9.1 7.932 ms 0.0374 ms 3540.14 KB
New 7.997 ms (+1%) 0.0368 ms 3541.17 KB (0%)
Old ParseProgram jquery.mobile-1.4.2 12.138 ms 0.0350 ms 5609.22 KB
New 12.053 ms (-1%) 0.0513 ms 5609.56 KB (0%)
Old ParseProgram mootools-1.4.5 6.363 ms 0.0213 ms 2938.37 KB
New 6.377 ms (0%) 0.0284 ms 2938.73 KB (0%)
Old ParseProgram underscore-1.5.2 1.156 ms 0.0048 ms 555.5 KB
New 1.139 ms (-1%) 0.0046 ms 555.45 KB (0%)
Old ParseProgram yui-3.12.0 5.439 ms 0.0223 ms 2714.65 KB
New 5.476 ms (+1%) 0.0237 ms 2714.93 KB (0%)

Esprima.Benchmark.NodeListEnumerationBenchmark

Diff Method NodeListIndex Mean Error Allocated
Old For_DirectIndexing 0 5,633,143.177 ns 25,214.8469 ns 7 B
New 0 2,318,609.390 ns (-59%) 11,763.7113 ns 3 B (-57%)
Old For_SpanIndexing 0 2,629,088.965 ns 3,468.6020 ns 3 B
New 0 2,619,675.417 ns (0%) 8,738.2733 ns 4 B (+33%)
Old ForEach_Span 0 2,630,783.750 ns 7,398.6263 ns 3 B
New 0 2,698,511.576 ns (+3%) 50,837.2970 ns 3 B (0%)
Old For_DirectIndexing 1 1,782.627 ns 1.1875 ns 0 B
New 1 (0%) 685.421 ns (-62%) 2.2029 ns 0 B
Old For_SpanIndexing 1 669.814 ns 3.1510 ns 0 B
New 1 (0%) 673.555 ns (+1%) 0.1979 ns 0 B
Old ForEach_Span 1 674.983 ns 0.6107 ns 0 B
New 1 (0%) 674.960 ns (0%) 4.5344 ns 0 B
Old For_DirectIndexing 2 20.300 ns 0.1101 ns 0 B
New 2 (0%) 5.302 ns (-74%) 0.0018 ns 0 B
Old For_SpanIndexing 2 4.746 ns 0.0197 ns 0 B
New 2 (0%) 4.704 ns (-1%) 0.0063 ns 0 B
Old ForEach_Span 2 4.799 ns 0.0218 ns 0 B
New 2 (0%) 4.735 ns (-1%) 0.0129 ns 0 B

Esprima.Benchmark.VisitorBenchmark

Diff Method Mean Error Allocated
Old Visit 11.54 ms 0.225 ms 14 B
New 10.49 ms (-9%) 0.103 ms 14 B (0%)
Old Rewrite 23.60 ms 0.467 ms 28 B
New 22.46 ms (-5%) 0.157 ms 28 B (0%)
adams85 commented 1 year ago

Now indexing is as fast as spans or even faster for small lists (0-1), spans start to shine when iterating more items.

💡 We maybe should consider converting visitors to use foreach + span.

Sounds like it's worth a try. Thanks to our source generation magic, probably it wouldn't be too tedious. So, I'm looking forward to the new benchmark results! ;)

lahma commented 1 year ago

OK, I've reverted most of the PR and should now only contain the gist of indexing improvements. main vs this branch:

Esprima.Benchmark.NodeListEnumerationBenchmark

Diff Method Toolchain Mean Error Allocated
Old For_DirectIndexing Default 5,839,201.953 ns 15,907.3791 ns 7 B
New Default 2,683,663.854 ns (-54%) 6,805.3141 ns 3 B (-57%)
Old For_SpanIndexing Default 2,629,126.875 ns 9,483.2562 ns 4 B
New Default 2,636,007.338 ns (0%) 3,330.0859 ns 4 B (0%)
Old ForEach_Span Default 2,634,116.172 ns 19,719.2877 ns 4 B
New Default 2,626,229.010 ns (0%) 8,729.7299 ns 3 B (-25%)
Old For_DirectIndexing Default 1,776.865 ns 0.1620 ns 0 B
New Default 682.067 ns (-62%) 0.7556 ns 0 B
Old For_SpanIndexing Default 683.543 ns 1.5607 ns 0 B
New Default 672.859 ns (-2%) 0.8225 ns 0 B
Old ForEach_Span Default 686.167 ns 4.9590 ns 0 B
New Default 666.897 ns (-3%) 0.5036 ns 0 B
Old For_DirectIndexing Default 18.818 ns 0.1431 ns 0 B
New Default 4.997 ns (-73%) 0.0213 ns 0 B
Old For_SpanIndexing Default 4.758 ns 0.0104 ns 0 B
New Default 4.390 ns (-8%) 0.0057 ns 0 B
Old ForEach_Span Default 4.756 ns 0.0056 ns 0 B
New Default 4.405 ns (-7%) 0.0071 ns 0 B
adams85 commented 1 year ago

FWIW, ReadOnlySpan<T> ctor does less checks than Span<T> ctor.

I assume that spans' worse performance for smaller lists is caused by the initial cost of constructing a span. So, for read-only iteration ReadOnlySpan<T> is a tiny bit better.

lahma commented 1 year ago

Thanks for helping out and getting this finalized!