Improve performance of Matcher

jamesbascle commented 7 years ago

One of the things that makes the matcher slower than it otherwise need be is the need to construct a new matcher for each call, and check the type in the match call at that exact time, then continue to fall through.

We could explore a number of options:

Early termination of the matching when a match is found and a result is calculated.
Having Matcher<T0,T1,T2...> Implement all combinations of interfaces of T*, ie IMatcher<T0,T1,T2,TResult>, IMatcher<T0,T1,TResult>, IMatcher<T0,T2,TResult>, etc. Can maybe be done by having each matcher subclass the one of N-1 arity and adding only the relevant new Matcher classes. We then use a builder pattern to accumulate the passed in functions as a Dictonary<Type, Func<object, TResult>>. This will require only 2^N interface implementations, where N is the Number of TN type parameters on the OneOf. Then, each call to match simply stashes the function away, and returns itself as an IMatcher of lower arity.

jamesbascle commented 7 years ago

@Jagged Let me know your thoughts.

Jagged commented 7 years ago

Matcher is a struct so ctor costs should (theoretically) be minimal, and hopefully inlined with other small methods. We can add [MethodImpl(MethodImplOptions.AggressiveInlining)] to the small methods to improve the chances of inlining.

The x.Match().Match().Match() is effectively Match(Match(Match(x))), so terminating early isn't an option when we're using the fluent style.

Option 2 might work, though to be honest it feels OTT (over the top). It all comes down to whether there really is a performance issue with the ctor. What does a profiler say? Is there an actual performance problem when built in release mode?

jamesbascle commented 7 years ago

Yeah the performance falls off a decent bit on the later cases, actually. Basically monotonically decreasing throughput. I was hoping a builder/accumulator, and lookup on the origType property could help. We could also do something as simple as, like in the switcher, just have a boolean that says whether its matched or not. This would boil down to a bit compare rather than a type comparison.

On Tue, Oct 18, 2016 at 5:07 AM, Will Mooar notifications@github.com wrote:

Matcher is a struct so ctor costs should (theoretically) be minimal, and hopefully inlined with other small methods. We can add [MethodImpl(MethodImplOptions.AggressiveInlining)] to the small methods to improve the chances of inlining.

The x.Match().Match().Match() is effectively Match(Match(Match(x))), so terminating early isn't an option when we're using the fluent style.

Option 2 might work, though to be honest it feels OTT (over the top). It all comes down to whether there really is a performance issue with the ctor. What does a profiler say? Is there an actual performance problem when built in release mode?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jamesbascle/OneOf/issues/1#issuecomment-254450145, or mute the thread https://github.com/notifications/unsubscribe-auth/AD_ohZg7vH4FeHt5uAtEP0QEI_4KEER3ks5q1IxMgaJpZM4KYka0 .

Jagged commented 7 years ago

We're doing a null check against result before attempting the type check, so that should be same speed as checking a boolean flag. Both are 32 bits. Maybe we could try ReferenceEquals to do the null check to ensure we avoid a call to Equals.

Adding an extra field to the struct will cost more time to construct it and also when passing the struct around. I've been thinking on how we could remove fields to keep the struct small, not add more. Otherwise it may become cheaper to use a class. Again, a profiler is our friend here. I'd like to separate the performance tests from the unit tests so we can use a profiler, and also avoid killing VS.

The recent work that added origType and more specifically the more complex Create/CreateRaw, those may have added the biggest hit. I'm not totally convinced that all of that extra work to support and differentiate between both child and parent types in a OneOf is worth it, but I wanted to see where you are going with it. Are you actually using this in real code?

In time when MS add the new switch and hopefully match they will use is semantics, or at least that's what all the previews show so far. It would be great to be able to drop Match and Switch from OneOf in time, if we stay compatible with where MS is heading,

That's my thoughts looking forward anyways... =)

jamesbascle commented 7 years ago

Yeah, using in real code as of yesterday. To me the making it switch on the real type is crucial for my purposes...if a child is passed in as a parent reference, it should execute as the child IMO.

I agree perf should be separated, there why I gave them different test categories. Does that not work for your configuration? I started running dotTrace on it sunday, but didn't get too far into it.

For the T5 and T6 cases, it is pretty obviously the dynamic cast. I think we could potentially cache that cast's result on a OneOf type by type basis since static fields are unique per generic parameter combo.

For the others I'm just going to sit down and switch up some things and see where I can tighten it up according to dotTrace.

On Oct 18, 2016 6:32 AM, "Will Mooar" notifications@github.com wrote:

We're doing a null check against result before attempting the type check, so that should be same speed as checking a boolean flag. Both are 32 bits. Maybe we could try ReferenceEquals to do the null check to ensure we avoid a call to Equals.

Adding an extra field to the struct will cost more time to construct it and also when passing the struct around. I've been thinking on how we could remove fields to keep the struct small, not add more. Otherwise it may become cheaper to use a class. Again, a profiler is our friend here. I'd like to separate the performance tests from the unit tests so we can use a profiler, and also avoid killing VS.

The recent work that added origType and more specifically the more complex Create/CreateRaw, those may have added the biggest hit. I'm not totally convinced that all of that extra work to support and differentiate between both child and parent types in a OneOf is worth it, but I wanted to see where you are going with it. Are you actually using this in real code?

In time when MS add the new switch and hopefully match they will use is semantics, or at least that's what all the previews show so far. It would be great to be able to drop Match and Switch from OneOf in time, if we stay compatible with where MS is heading,

That's my thoughts looking forward anyways... =)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jamesbascle/OneOf/issues/1#issuecomment-254469434, or mute the thread https://github.com/notifications/unsubscribe-auth/AD_ohRuBoDkpE74CPueLODFkLXWZ35snks5q1KArgaJpZM4KYka0 .

jamesbascle commented 7 years ago

And yeah, they will be doing is semantics, but also allowing break statement. So if you order your cases properly, you don't execute the parent class/default case since you can break out, whereas with ours you will be guaranteed to run just the once. We also have restrictions on the array of types to pass in, and they don't have to be in a particular hierarchy to be sensible.

Jagged commented 7 years ago

I agree perf should be separated, there why I gave them different test categories. Does that not work for your configuration? Sadly not, I'm running the NUnit tests directly within VS's Test Explorer using the NUnit Test Adapter. With the option for automatically running tests after building enabled, it also runs the perf tests. I got caught out the other day when I had the option turned off and committed a buggy build.

it is pretty obviously the dynamic cast. It will be interesting to determine the most specific Create method to call. Probably using type.IsAssignableFrom, where x.IAF(y) && !y.IAF(x) means x is the more specific type.

jamesbascle commented 7 years ago

I was thinking we add a static cache that lets us stash which method to call, and a wrapper for the create methods that gets called the first time for each type. So the dynamic call only has to be run once per type per appdomain.

On Oct 18, 2016 3:59 PM, "Will Mooar" notifications@github.com wrote:

I agree perf should be separated, there why I gave them different test categories. Does that not work for your configuration? Sadly not, I'm running the NUnit tests directly within VS's Test Explorer using the NUnit Test Adapter. With the option for automatically running tests after building enabled, it also runs the perf tests. I got caught out the other day when I had the option turned off and committed a buggy build.

it is pretty obviously the dynamic cast. It will be interesting to determine the most specific Create method to call. Probably using type.IsAssignableFrom, where x.IAF(y) && !y.IAF(x) means x is the more specific type.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jamesbascle/OneOf/issues/1#issuecomment-254621765, or mute the thread https://github.com/notifications/unsubscribe-auth/AD_ohS-RffnhMlu9QIu699un7to0RJ86ks5q1SUxgaJpZM4KYka0 .

jamesbascle commented 7 years ago

I see. I was originally going to put the perf tests in an entirely other project, but decided categories would work just fine. I did not consider your use case of just base VS, which of course I should have. Feel free to pull em over, or I'll do it next time i get to working on the code.

jamesbascle commented 7 years ago

We could also probablycut down the execution time a good bit by just wrapping each benchmark's code in a for loop of about 100-1000 iterations so the benchmark framework code doesn't execute so much.

jamesbascle / OneOf

Improve performance of Matcher #1