sklose / NCalc2

expression evaluator for .NET with built-in compiler
MIT License
166 stars 58 forks source link

Performance after upgrade to antlr4 #78

Closed Bykiev closed 8 months ago

Bykiev commented 8 months ago

Hi, there is a performance hit after upgrade to Antlr 4. You can compare performance tests on AppVeyor:

2.2.111: https://ci.appveyor.com/project/sklose/ncalc2/builds/46365044/tests 3.0.119: https://ci.appveyor.com/project/sklose/ncalc2/build/tests

test 3.0.119 2.2.111
NCalc.Tests.Performance.ParameterAccess(formula: "[Param1] * 7 + [Param2]") 176 ms 95 ms
NCalc.Tests.Performance.Arithmetics(formula: "(4 12 / 7) + ((9 2) % 8)") 424 ms 102 ms
NCalc.Tests.Performance.DynamicParameterAccess(formula: "[Param1] * 7 + [Param2]") 156 ms 89 ms
NCalc.Tests.Performance.Arithmetics(formula: "5 2 = 2 5 && (1 / 3.0) * 3 = 1") 805 ms 379 ms
NCalc.Tests.Performance.FunctionWithDynamicParameterAccess(formula: "Foo([Param1] * 7, [Param2])") 265 ms 87 ms

@psiservices-dbrink, is any ideas how to improve the performance?

david-brink-talogy commented 8 months ago

I only made minor edits to the grammar to do the antlr4 port. Based on some quick research antlr4's lexer/parser is more powerful, but can be slower on some grammars.

Some possibilities to consider:

Try SLL prediction with potential fallback https://stackoverflow.com/questions/29027464/c-sharp-antlr4-maximize-speed

Disable tracing https://github.com/ncalc/ncalc/issues/51

Bykiev commented 8 months ago

Disable tracing ncalc/ncalc#51

Tracing was already disabled in https://github.com/sklose/NCalc2/pull/70

david-brink-talogy commented 8 months ago

Potentially related https://github.com/ncalc/ncalc/issues/81

Bykiev commented 8 months ago

Potentially related ncalc/ncalc#81

I don't think so, based on release notes of ANTLR, there is no performance-related changes in C# targets in the latest version, since NCalc2 depends on ANTLR 4.13.0

david-brink-talogy commented 8 months ago

Potentially related ncalc/ncalc#81

I don't think so, based on release notes of ANTLR, there is no performance-related changes in C# targets in the latest version, since NCalc2 depends on ANTLR 4.13.0

The reporter of the issue indicates that version 3.6.0 is faster than newer versions of NCalcSync on Xamarin. If you look at the dependencies you'll see that 3.7.0 is the first version that depends on ANTLR4. Perhaps the issue is really unique to Xamarin, but it may be broader. Reiterating what I wrote on my initial reply, it could be that ANTLR4 is just slower than ANTLR3 on the ncalc grammar.

dependencies https://www.nuget.org/packages/NCalcSync/3.6.0#dependencies-body-tab https://www.nuget.org/packages/NCalcSync/3.7.0#dependencies-body-tab

Here's an article on improving ANTLR4 performance, but it would take some serious time to investigate. I think the SLL change could yield some quick gains if you're interested in opening a PR.
https://tomassetti.me/improving-the-performance-of-an-antlr-parser/

david-brink-talogy commented 8 months ago

One other thing to note is that I also changed the appveyor build image. If their VS 2022 image has fewer resources the metrics could be worse for reasons unrelated to the code. Are you also observing the performance decrease in a like for like environment?

Bykiev commented 8 months ago

Potentially related ncalc/ncalc#81

I don't think so, based on release notes of ANTLR, there is no performance-related changes in C# targets in the latest version, since NCalc2 depends on ANTLR 4.13.0

The reporter of the issue indicates that version 3.6.0 is faster than newer versions of NCalcSync on Xamarin. If you look at the dependencies you'll see that 3.7.0 is the first version that depends on ANTLR4. Perhaps the issue is really unique to Xamarin, but it may be broader. Reiterating what I wrote on my initial reply, it could be that ANTLR4 is just slower than ANTLR3 on the ncalc grammar.

dependencies https://www.nuget.org/packages/NCalcSync/3.6.0#dependencies-body-tab https://www.nuget.org/packages/NCalcSync/3.7.0#dependencies-body-tab

You're right, Antlr was updated from 4.12 to 4.13.1 in NCalc in the latest version and it potentially could be faster after fixing this bug. But NCalc2 already depends on Antlr >=4.13.0.

Indeed, it seems the NCalc grammar is not efficient for Antlr 4, but I have no experience with it, so I can't recommend anything to improve the grammar. I see you have some experience with it, maybe you can see what can be improved?

Bykiev commented 8 months ago

One other thing to note is that I also changed the appveyor build image. If their VS 2022 image has fewer resources the metrics could be worse for reasons unrelated to the code. Are you also observing the performance decrease in a like for like environment?

I did some tests on my laptop and the results are even worse. We can also measure results with Benchmark.NET on latest builds with Antlr3 and Antlr4.

Bykiev commented 8 months ago

I've created the test with BenchmarkDotNet and I was totally wrong...

Here is my benchmark class:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Toolchains.InProcess.NoEmit;
using NCalc;

namespace ConsoleApp14
{
    [Config(typeof(Config))]
    public class CalculationTest
    {
        private class Config : ManualConfig
        {
            public Config()
            {
                var baseJob = Job.Default;

                AddJob(baseJob.WithNuGet("CoreCLR-NCalc", "3.0.119")
                    .WithId("3.0.119")
                    .WithToolchain(InProcessNoEmitToolchain.Instance));
                AddJob(baseJob.WithNuGet("CoreCLR-NCalc", "2.2.113").WithId("2.2.113")
                    .WithBaseline(true)
                    .WithToolchain(InProcessNoEmitToolchain.Instance));
            }
        }

        [Params("(4 * 12 / 7) + ((9 * 2) % 8)", "5 * 2 = 2 * 5 && (1 / 3.0) * 3 = 1")]
        public string Formula { get; set; }

        [Benchmark]
        public void TestFormulaEvaluation()
        {
            var expression = new Expression(Formula);
            var lambda = expression.ToLambda<object>();

            expression.Evaluate();
            lambda();
        }
    }
}

BenchmarkDotNet v0.13.11, Windows 10 (10.0.19045.3803/22H2/2022Update)
Intel Core i3-6100U CPU 2.30GHz (Skylake), 1 CPU, 4 logical and 2 physical cores
.NET SDK 7.0.306
  [Host] : .NET 6.0.20 (6.0.2023.32017), X64 RyuJIT AVX2

Toolchain=InProcessNoEmitToolchain  
Method Job NuGetReferences Formula Mean Error StdDev Ratio RatioSD
TestFormulaEvaluation 2.2.113 CoreCLR-NCalc 2.2.113 *(4 (...) % 8) [28]** 101.66 μs 1.868 μs 1.458 μs 1.00 0.00
TestFormulaEvaluation 3.0.119 CoreCLR-NCalc 3.0.119 (4 * (...) % 8) [28] 98.17 μs 1.850 μs 2.056 μs 0.96 0.03
TestFormulaEvaluation 2.2.113 CoreCLR-NCalc 2.2.113 *5 2(...)3 = 1 [34]** 113.32 μs 2.025 μs 2.332 μs 1.00 0.00
TestFormulaEvaluation 3.0.119 CoreCLR-NCalc 3.0.119 5 * 2(...)3 = 1 [34] 113.30 μs 2.244 μs 2.917 μs 1.00 0.03

As you can see, the result is quite similar in both versions. Sorry for my fault. I'll close the issue.