lahma commented 1 year ago

I was checking how much faster it is to use span instead of indexing (Jint uses indexing at the moment) and I though that it shouldn't have that much difference as it had so tweaked code a bit.

better indexer with less IL - it's enough to have nameof(something) to cause ldstring, more fine-grained ThrowHelper to the rescue
encapsulate EnsureCapacity as single method to resize
remove TrimExcess as I didn't see much point in it performance wise

Results

Modified benchmark (no boxing for result) was used against both main and this PR. The couple byte allocations in 0 size test case ForLoopBenchmark is quite curious, but didn't find out the reason. Now indexing is as fast as spans or even faster for small lists (0-1), spans start to shine when iterating more items.

💡 We maybe should consider converting visitors to use foreach + span.


BenchmarkDotNet v0.13.7, Windows 11 (10.0.23516.1000)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 7.0.400
  [Host]     : .NET 6.0.21 (6.0.2123.36311), X64 RyuJIT AVX2
  DefaultJob : .NET 6.0.21 (6.0.2123.36311), X64 RyuJIT AVX2

Esprima.Benchmark.AstTreeWalkBenchmark

Diff	Method	Mean	Error	Allocated
Old	VisitChildren	2.565 ms	0.0151 ms	35 B
New		2.572 ms (0%)	0.0140 ms	35 B (0%)

Esprima.Benchmark.FileParsingBenchmark

Diff	Method	FileName	Mean	Error	Allocated
Old	ParseProgram	angular-1.2.5	10.231 ms	0.0621 ms	4095.44 KB
New			10.238 ms (0%)	0.0449 ms	4095.76 KB (0%)
Old	ParseProgram	backbone-1.1.0	1.348 ms	0.0018 ms	647.77 KB
New			1.329 ms (-1%)	0.0083 ms	647.8 KB (0%)
Old	ParseProgram	jquery-1.9.1	7.932 ms	0.0374 ms	3540.14 KB
New			7.997 ms (+1%)	0.0368 ms	3541.17 KB (0%)
Old	ParseProgram	jquery.mobile-1.4.2	12.138 ms	0.0350 ms	5609.22 KB
New			12.053 ms (-1%)	0.0513 ms	5609.56 KB (0%)
Old	ParseProgram	mootools-1.4.5	6.363 ms	0.0213 ms	2938.37 KB
New			6.377 ms (0%)	0.0284 ms	2938.73 KB (0%)
Old	ParseProgram	underscore-1.5.2	1.156 ms	0.0048 ms	555.5 KB
New			1.139 ms (-1%)	0.0046 ms	555.45 KB (0%)
Old	ParseProgram	yui-3.12.0	5.439 ms	0.0223 ms	2714.65 KB
New			5.476 ms (+1%)	0.0237 ms	2714.93 KB (0%)

Esprima.Benchmark.NodeListEnumerationBenchmark

Diff	Method	NodeListIndex	Mean	Error	Allocated
Old	For_DirectIndexing	0	5,633,143.177 ns	25,214.8469 ns	7 B
New		0	2,318,609.390 ns (-59%)	11,763.7113 ns	3 B (-57%)
Old	For_SpanIndexing	0	2,629,088.965 ns	3,468.6020 ns	3 B
New		0	2,619,675.417 ns (0%)	8,738.2733 ns	4 B (+33%)
Old	ForEach_Span	0	2,630,783.750 ns	7,398.6263 ns	3 B
New		0	2,698,511.576 ns (+3%)	50,837.2970 ns	3 B (0%)
Old	For_DirectIndexing	1	1,782.627 ns	1.1875 ns	0 B
New		1 (0%)	685.421 ns (-62%)	2.2029 ns	0 B
Old	For_SpanIndexing	1	669.814 ns	3.1510 ns	0 B
New		1 (0%)	673.555 ns (+1%)	0.1979 ns	0 B
Old	ForEach_Span	1	674.983 ns	0.6107 ns	0 B
New		1 (0%)	674.960 ns (0%)	4.5344 ns	0 B
Old	For_DirectIndexing	2	20.300 ns	0.1101 ns	0 B
New		2 (0%)	5.302 ns (-74%)	0.0018 ns	0 B
Old	For_SpanIndexing	2	4.746 ns	0.0197 ns	0 B
New		2 (0%)	4.704 ns (-1%)	0.0063 ns	0 B
Old	ForEach_Span	2	4.799 ns	0.0218 ns	0 B
New		2 (0%)	4.735 ns (-1%)	0.0129 ns	0 B

Esprima.Benchmark.VisitorBenchmark

Diff	Method	Mean	Error	Allocated
Old	Visit	11.54 ms	0.225 ms	14 B
New		10.49 ms (-9%)	0.103 ms	14 B (0%)
Old	Rewrite	23.60 ms	0.467 ms	28 B
New		22.46 ms (-5%)	0.157 ms	28 B (0%)

adams85 commented 1 year ago

Now indexing is as fast as spans or even faster for small lists (0-1), spans start to shine when iterating more items.

💡 We maybe should consider converting visitors to use foreach + span.

Sounds like it's worth a try. Thanks to our source generation magic, probably it wouldn't be too tedious. So, I'm looking forward to the new benchmark results! ;)

lahma commented 1 year ago

OK, I've reverted most of the PR and should now only contain the gist of indexing improvements. main vs this branch:

Esprima.Benchmark.NodeListEnumerationBenchmark

Diff	Method	Toolchain	Mean	Error	Allocated
Old	For_DirectIndexing	Default	5,839,201.953 ns	15,907.3791 ns	7 B
New		Default	2,683,663.854 ns (-54%)	6,805.3141 ns	3 B (-57%)
Old	For_SpanIndexing	Default	2,629,126.875 ns	9,483.2562 ns	4 B
New		Default	2,636,007.338 ns (0%)	3,330.0859 ns	4 B (0%)
Old	ForEach_Span	Default	2,634,116.172 ns	19,719.2877 ns	4 B
New		Default	2,626,229.010 ns (0%)	8,729.7299 ns	3 B (-25%)
Old	For_DirectIndexing	Default	1,776.865 ns	0.1620 ns	0 B
New		Default	682.067 ns (-62%)	0.7556 ns	0 B
Old	For_SpanIndexing	Default	683.543 ns	1.5607 ns	0 B
New		Default	672.859 ns (-2%)	0.8225 ns	0 B
Old	ForEach_Span	Default	686.167 ns	4.9590 ns	0 B
New		Default	666.897 ns (-3%)	0.5036 ns	0 B
Old	For_DirectIndexing	Default	18.818 ns	0.1431 ns	0 B
New		Default	4.997 ns (-73%)	0.0213 ns	0 B
Old	For_SpanIndexing	Default	4.758 ns	0.0104 ns	0 B
New		Default	4.390 ns (-8%)	0.0057 ns	0 B
Old	ForEach_Span	Default	4.756 ns	0.0056 ns	0 B
New		Default	4.405 ns (-7%)	0.0071 ns	0 B

adams85 commented 1 year ago

FWIW, ReadOnlySpan<T> ctor does less checks than Span<T> ctor.

I assume that spans' worse performance for smaller lists is caused by the initial cost of constructing a span. So, for read-only iteration ReadOnlySpan<T> is a tiny bit better.

lahma commented 1 year ago

Thanks for helping out and getting this finalized!

sebastienros / esprima-dotnet

Improve ArrayList and NodeList #396

Results

Esprima.Benchmark.AstTreeWalkBenchmark

Esprima.Benchmark.FileParsingBenchmark

Esprima.Benchmark.NodeListEnumerationBenchmark

Esprima.Benchmark.VisitorBenchmark

Esprima.Benchmark.NodeListEnumerationBenchmark