tc39 / proposal-pattern-matching

Pattern matching syntax for ECMAScript
https://tc39.es/proposal-pattern-matching/
MIT License
5.5k stars 89 forks source link

Vague runtime performance concerns about custom matchers #253

Closed syg closed 7 months ago

syg commented 2 years ago

The Result example using the custom matcher protocol gives me some pause as an implementer. Checking for errors in function return values seem to be a pretty hot-path thing. I can see this pattern becoming very popular. And so it gives me pause that a possibly very popular pattern seems fairly expensive during runtime: to look up user methods, run it, create match result objects, etc.

Any thoughts on optimization strategies?

Jack-Works commented 2 years ago

to look up user methods

maybe #252? it can move "lookup user methods" to another syntax. ${} just do the value matching.

run it

there is nothing we can do... the core of the custom matcher is to run it 😂 maybe the engine can "inline" it to make it fast when the matcher is very simple (like Result)?

create match result objects

have no idea about this. how can we contain the result otherwise?

ljharb commented 2 years ago

Wouldn't any code doing this already need a Promise catch, or a try/catch, which seems a lot slower than introspecting a return object?

tabatkins commented 2 years ago

Depending on the code, it might just be doing something equivalent to if(ret == -1), which is indeed much faster.

syg commented 2 years ago

Depending on the code, it might just be doing something equivalent to if(ret == -1), which is indeed much faster.

Right, this is my worry, that this new proposal enables large refactorings of codebases in the name of better software engineering at significant runtime cost. We can't have zero-cost abstractions, but I still care about cheap-ish abstractions.

syg commented 2 years ago

or a try/catch

Also a try/catch isn't necessarily slower if error cases are exceptional -- the slowness is when a thing is actually thrown and need to be caught. In this case, inspecting a return value and unwrapping it will be done unconditionally on all branches.

ljharb commented 2 years ago

node's async callbacks both historically and primarily use "errorbacks" - ie, a function callback with a signature of (error, result), and so all these callbacks have if (err) { /* do something *'/ } in them. is this a different perf concern?

syg commented 2 years ago

node's async callbacks both historically and primarily use "errorbacks" - ie, a function callback with a signature of (error, result), and so all these callbacks have if (err) { / do something '/ } in them. is this a different perf concern?

I'm confused, what's the relevance of that?

ljharb commented 2 years ago

@syg it's a common userland example of introspecting an argument (albeit not a return type) for errors, that doesn't cause performance issues.

syg commented 2 years ago

@syg it's a common userland example of introspecting an argument (albeit not a return type) for errors, that doesn't cause performance issues.

Naively I'd assume that's because the cost is dominated by async handling overhead. The Result example from the slide would also apply to sync code, where it's more of a worry.

shaylew commented 1 year ago

Not a JS implementer here but coming from functional languages with fast case constructs this is also a bit concerning to me. In a dynamic language you're going to pay a bit for any abstraction like this, but it's kind of a lot to have to (a) look up a method, (b) call it, (c) create an object, and (d) access the matched field, for each pattern in the case.

Coming from FP languages with native case, the thing I'd really want is to not have pattern matching on ADTs be linear in the number of branches. This seems conceivably doable with object patterns containing a key that's always a constant discriminator, if engines decide it's worth optimizing that. But unless the API is unworkably restrictive, "check each pattern once until something matches" seems like the best you can do for custom matchers, which have none of the guarantees of statelessness or exclusivity that ordinary ADT constructor matching would have. (Would match expressions have saved TS from having to switch to a manual jump table? Is avoiding that pattern a design goal?)

As for trying to make custom matchers at least slightly more comparable to the res === -1 check...

Without changing custom matchers: For ADTs/sum types specifically, you can avoid creating match result objects by arranging for your ADT values to already fit the custom matcher protocol. With ADT objects shaped like { matched: true, value: V } you can return the ADT itself as the match on success and reuse one { matched: false } object for all failures,

Requires cleverness from library authors (which ends up exposed as weird extra properties to library users), isn't fully general, avoids (c).

For sum types with 0-argument constructors -- like the prototypical Result example -- if the library ensures there's only ever one Result.None, they can use the non-method version of the [Symbol.matcher], which avoids (b) as well (but only for None, not Some).

Use a protocol with a designated "match failed" sentinel, and treat all other return values as the match success output. Presumably something like Symbol.matchFailed, so it doesn't happen by accident. This introduces some ugly edge cases (matching against Result.Some(Symbol.matchFailed)) because it gives up parametricity.

Avoids (c) and (d), at the cost of introducing some surprises. Smaller matcher functions and avoiding the object creation/field projection might make inlining more effective at

There's a hybrid approach where you can use the sentinel for failed matches but have to construct an object for successes. This does have parametricity, but it's not clear what it buys you over the current proposal (where libraries can reuse a single match failure object already).

tabatkins commented 1 year ago

I raised a possible approach to helping with this in https://github.com/tc39/proposal-pattern-matching/issues/303 - thinking we could split the plain Foo variable pattern from the more complex Foo(...) extractor patterns. The former would just need to return true/false, which I think is the absolute minimum cost we can get if they're user-extensible; only the latter would need to construct an iterable (and that's precisely the case where the iterable is used, so the cost is justified).

tabatkins commented 7 months ago

Closing this, as I think the spec addresses the concerns well now, in two ways:

  1. We pass a second options-bag argument to the custom matcher, which has a matchType key that tells you whether it's being matched as a boolean or as an extractor. So you can avoid doing the expensive work when someone is just doing when MyClass, and only perform it when they actually write when MyClass(foo, bar).
  2. We added a special behavior for extractors when the returned value is literally an Array, that just checks the length property and the indexed properties, rather than invoking the iterator protocol. This should be the very common case, and much much easier to optimize for.