Closed Bykiev closed 9 months ago
@david-brink-talogy, comparing with ToUpper()/ToLower() is not the best choice as it'll be not working with some cultures. For ex.:
Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
const string input = "interesting";
bool comparison = input.ToUpper() == "INTERESTING";
Console.WriteLine("These things are equal: " + comparison);
@david-brink-talogy, I did some more research and found that in switch block comparison occurs with String.Equals(string, string) which is compiled into:
String.Equals(string1, string2, StringComparison.Ordinal);
,
so we can convert the function name to upper case with ToUpperInvariant()
and it could be faster. I'll prepare some tests with BenchmarkDotNet later to check it .
I did the benchmarking, here is results:
public class TestComparision
{
string functionName = "in";
[Benchmark]
public void TestWithToUpperInvariant()
{
string upperFuncName = functionName.ToUpperInvariant();
switch (upperFuncName)
{
case "ABS": { break; }
case "ACOS": { break; }
case "ASIN": { break; }
case "ATAN": { break; }
case "CEILING": { break; }
case "COS": { break; }
case "EXP": { break; }
case "FLOOR": { break; }
case "IEEEREMAINDER": { break; }
case "LOG": { break; }
case "LOG10": { break; }
case "POW": { break; }
case "ROUND": { break; }
case "SIGN": { break; }
case "SIN": { break; }
case "SQRT": { break; }
case "TAN": { break; }
case "TRUNCATE": { break; }
case "MAX": { break; }
case "MIN": { break; }
case "IF": { break; }
case "IN": { break; }
}
}
[Benchmark(Baseline =true)]
public void TestWithOrdinalIgnoreCase()
{
switch (functionName)
{
case string n when n.Equals("abs", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("acos", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("asin", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("atan", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("ceiling", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("cos", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("exp", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("floor", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("ieeeremainder", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("log", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("log10", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("pow", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("round", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("sign", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("sin", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("sqrt", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("tan", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("truncate", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("max", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("min", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("if", StringComparison.OrdinalIgnoreCase): { break; }
case string n when n.Equals("in", StringComparison.OrdinalIgnoreCase): { break; }
}
}
}
BenchmarkDotNet v0.13.12, Windows 10 (10.0.19044.2965/21H2/November2021Update) Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores .NET SDK 8.0.101 [Host] : .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2 DefaultJob : .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
Method | Mean | Error | StdDev | Median | Ratio | RatioSD |
---|---|---|---|---|---|---|
TestWithToUpperInvariant | 22.96 ns | 0.491 ns | 1.408 ns | 22.46 ns | 0.44 | 0.03 |
TestWithOrdinalIgnoreCase | 52.83 ns | 1.014 ns | 1.318 ns | 52.60 ns | 1.00 | 0.00 |
I still can't understand why does ToUpperInvariant() + Ordinal is faster than OrdinalIgnoreCase. Any ideas?
Upd. Well, it's because of with OrdinalIgnoreCase the case is checked on each iteration and with Ordinal comparer only the first time, with abs
(which is first) the results are different:
Method | Mean | Error | StdDev | Ratio | RatioSD |
---|---|---|---|---|---|
TestWithToUpperInvariant | 21.927 ns | 0.1647 ns | 0.1286 ns | 8.08 | 0.24 |
TestWithOrdinalIgnoreCase | 2.702 ns | 0.0580 ns | 0.0754 ns | 1.00 | 0.00 |
Don't know exactly does it makes any sense to change the existing logic? We can also change the order of switch case to make most common functions on top, but what is the most common is differs...
So, with ToUpperInvariant()
the results will be similar and the position of function in switch/case does low impact on overall. With OrdinalIgnoreCase
the results will be vary in the begin and in the end, because case checks will be made on each iteration.
@sklose, what do you think about it?
I believe, we should change the order of case blocks to alphabetical, for some reason if, in, min, max
has wrong order
comparing with ToUpper()/ToLower() is not the best choice as it'll be not working with some cultures. For ex.:
You're correct, good callout.
Well, it's because of with OrdinalIgnoreCase the case is checked on each iteration and with Ordinal comparer only the first time, with abs (which is first) the results are different:
Yes, this is what I meant when I wrote "especially for the later functions.". The ncalc functions checked last have to be pattern matched against all the prior patterns first. A switch on static values should perform better since the compiler can more efficiently branch to the correct case. The normalization to invariant uppercase also happens just once, making each string comparison marginally faster.
Don't know exactly does it makes any sense to change the existing logic?
Based on your tests function name resolution is about twice as fast on average for "in". But obviously, we're talking about very small amounts of time here. Would you be willing to benchmark a few function calls as a single test (maybe ceiling, round, and if) to simulate a single expression with multiple functions that are spread out across the switch.
for some reason if, in, min, max has wrong order
These were probably just added last and were appended to the existing switch by the original developer.
@david-brink-talogy, here is a test results with ceiling
, round
, and if
functions:
Method | Mean | Error | StdDev | Ratio | RatioSD |
---|---|---|---|---|---|
TestWithToUpperInvariant | 65.46 ns | 0.973 ns | 1.265 ns | 0.75 | 0.02 |
TestWithOrdinalIgnoreCase | 87.52 ns | 1.337 ns | 1.185 ns | 1.00 | 0.00 |
@Bykiev , Thanks. If you feel comfortable making the change it does seem like there is a benefit.
@david-brink-talogy, thank you for discussion, I'll create a PR with these changes, also I'll change the order of function names in case block to follow alphabetical order. Thank you!
thanks for the research and benchmarking! Could it make sense to turn this into a dictionary lookup? That way the order of the functions wouldn't matter for performance purposes.
thanks for the research and benchmarking! Could it make sense to turn this into a dictionary lookup? That way the order of the functions wouldn't matter for performance purposes.
Sorry, didn't get the idea, can you please share you thoughts?
thanks for the research and benchmarking! Could it make sense to turn this into a dictionary lookup? That way the order of the functions wouldn't matter for performance purposes.
@sklose, I think a switch on static values should be very performant. The c# compiler should be able to create a jump table to find the appropriate match rather than needing to test every possible case in sequence.
@Bykiev, I think the dictionary implementation would have to have a callback (Func) defined for the implementation of each function. I did need to make CheckCase static to do this.
private static readonly Dictionary<string, Func<EvaluationVisitor, Function, object>> StandardFunctions = new Dictionary<string, Func<EvaluationVisitor, Function, object>>() {
{ "ABS", (visitor, function) => {
CheckCase("Abs", function.Identifier.Name);
if (function.Expressions.Length != 1)
throw new ArgumentException("Abs() takes exactly 1 argument");
bool useDouble = (visitor._options & EvaluateOptions.UseDoubleForAbsFunction) == EvaluateOptions.UseDoubleForAbsFunction;
if (useDouble)
{
return Math.Abs(Convert.ToDouble(
visitor.Evaluate(function.Expressions[0]))
);
}
else
{
return Math.Abs(Convert.ToDecimal(
visitor.Evaluate(function.Expressions[0]))
);
}
}
},
{ "ACOS", (visitor, function) => {
CheckCase("Acos", function.Identifier.Name);
if (function.Expressions.Length != 1)
throw new ArgumentException("Acos() takes exactly 1 argument");
return Math.Acos(Convert.ToDouble(visitor.Evaluate(function.Expressions[0])));
}
},
// continue...
};
// in place of the current switch
if (!StandardFunctions.TryGetValue(function.Identifier.Name.ToUpper(), out var functionImplementation))
{
throw new ArgumentException("Function not found", function.Identifier.Name);
}
Result = functionImplementation(this, function);
@david-brink-talogy, I did some benchmarks on my laptop and result is twice worse with static dictionary. Test case:
public class TestComparision
{
string[] functions = new string[] { "ceiling", "round", "if", "round", "if", "ceiling" };
private static readonly Dictionary<string, Func<EvaluationVisitor, Function, object?>> StandardFunctions = new() {
{ "ABS", (visitor, function) => { return null; } },
{ "ACOS", (visitor, function) => { return null; } },
{ "ASIN", (visitor, function) => { return null; } },
{ "ATAN", (visitor, function) => { return null; } },
{ "CEILING", (visitor, function) => { return null; } },
{ "COS", (visitor, function) => { return null; } },
{ "EXP", (visitor, function) => { return null; } },
{ "FLOOR", (visitor, function) => { return null; } },
{ "IEEEREMAINDER", (visitor, function) => { return null; } },
{ "LOG", (visitor, function) => { return null; } },
{ "LOG10", (visitor, function) => { return null; } },
{ "POW", (visitor, function) => { return null; } },
{ "ROUND", (visitor, function) => { return null; } },
{ "SIGN", (visitor, function) => { return null; } },
{ "SIN", (visitor, function) => { return null; } },
{ "SQRT", (visitor, function) => { return null; } },
{ "TAN", (visitor, function) => { return null; } },
{ "TRUNCATE", (visitor, function) => { return null; } },
{ "MAX", (visitor, function) => { return null; } },
{ "MIN", (visitor, function) => { return null; } },
{ "IF", (visitor, function) => { return null; } },
{ "IN", (visitor, function) => { return null; } }
};
[Benchmark]
public void TestWithToUpperInvariant()
{
foreach (var functionName in functions)
{
string upperFuncName = functionName.ToUpperInvariant();
switch (upperFuncName)
{
case "ABS": { break; }
case "ACOS": { break; }
case "ASIN": { break; }
case "ATAN": { break; }
case "CEILING": { break; }
case "COS": { break; }
case "EXP": { break; }
case "FLOOR": { break; }
case "IEEEREMAINDER": { break; }
case "LOG": { break; }
case "LOG10": { break; }
case "POW": { break; }
case "ROUND": { break; }
case "SIGN": { break; }
case "SIN": { break; }
case "SQRT": { break; }
case "TAN": { break; }
case "TRUNCATE": { break; }
case "MAX": { break; }
case "MIN": { break; }
case "IF": { break; }
case "IN": { break; }
}
}
}
[Benchmark(Baseline = true)]
public void TestWithStaticDic()
{
foreach (var functionName in functions)
{
if (!StandardFunctions.TryGetValue(functionName.ToUpperInvariant(), out var functionImplementation))
{
throw new ArgumentException("Function not found", functionName);
}
}
}
}
BenchmarkDotNet v0.13.12, Windows 10 (10.0.19045.3930/22H2/2022Update) Intel Core i3-6100U CPU 2.30GHz (Skylake), 1 CPU, 4 logical and 2 physical cores .NET SDK 8.0.200 [Host] : .NET 6.0.27 (6.0.2724.6912), X64 RyuJIT AVX2 DefaultJob : .NET 6.0.27 (6.0.2724.6912), X64 RyuJIT AVX2
Method | Mean | Error | StdDev | Ratio |
---|---|---|---|---|
TestWithToUpperInvariant | 212.5 ns | 3.20 ns | 2.84 ns | 0.64 |
TestWithStaticDic | 331.1 ns | 2.17 ns | 1.92 ns | 1.00 |
@Bykiev , I'm not entirely surprised.
I used the invariant switch you posted and benchmarked Abs, Log, In, and non-existent function "Foo". I don't think it matters what order the cases are in.
@david-brink-talogy, it's just a ns, the order doesn't matter with invariant switch, the most time occupies converting function name to upper. Let's merge this PR
Let's merge this PR
Agreed, it looks good.
Remove excessive check for case when case is ignored