tc39 / proposal-extractors

Extractors for ECMAScript
http://tc39.es/proposal-extractors/
MIT License

Concerns about the scope of the solution #19

Closed eemeli closed 4 months ago

eemeli commented 4 months ago

I started writing this in reply to https://github.com/tc39/proposal-extractors/issues/18#issuecomment-2040769594, but realized that I was sidetracking that issue. My concerns aren't really about the problem statement, but about the scope within which a solution to the problem is desirable.

Overall, I get the value of extraction within the context of pattern matching, but not really outside that. Inside match or with an is test, the reader is primed to expect validation and, potentially, extraction. But in destructuring or elsewhere, not so much. In those other places, it comes as added complexity, and doesn't really bring a great benefit compared to current syntax patterns.

Reviewing past presentations and discussions of this proposal, it seems like I'm echoing concerns raised by @brad4d during the 2022-09 meeting:

BSH: Yeah, my main concern here is, I feel like this feature is creating a way to write code that's very terse and clever but not necessarily easier to get right or easier to understand. In several of the examples, it looks to me like, you'd have to - to really know what's going on you would have to go and read what the extractor code does to understand. Well, what things can I use this extractor with? And what can I put on the right hand side? I don't see the benefit to balance out the readability. or what, wait, what does this in some way? Let you write more efficient code easily. or does it somehow - aside from the magical feeling and cleverness, what's the benefit?

[...] So I think what would address this - Not something you can do now, but what would address my concern here, is if I could have a solid example of some code. Like, here's how you would write the code with this feature available and here's how you write it without it, and it's not just that it's shorter, but it should be actually easier to understand and maybe more performant. because, there's some benefit that you're getting performance-wise out of like the environment doing this for you. That's what I would need to see. That's all I'm saying. I'm not trying to block this or anything. I just that's that's my concern. I feel like it's just feels too clever and not really overall beneficial. That's it for me.

In other words, my concern is that the examples currently in the readme or presented in the preceding discussion don't really seem like they present the new syntax as "easier to understand and maybe more performant".


Extractors encourage a leaner coding style that is often preferred by JS developers and is regularly seen in other languages with language features like Pattern Matching:

// (assumes Point.{ x, y } are public for the purpose of this example)

const { p1, p2 } = line; // top-level destructuring
if (!(p1 instanceof Point) || !(p2 instanceof Point)) throw new Error(); // data validation
const { x: x1, y: y1 } = p1; // nested destructuring
const { x: x2, y: y2 } = p2; // nested destructuring

// vs
const { p1: Point(x1, y1), p2: Point(x2, y2) } = line;

On the other hand, if we were to presume an equivalent Point definition like

class Point {
  #x;
  #y;
  constructor(x, y) {
    this.#x = x;
    this.#y = y;
  }
  static extract(subject) {
    return #x in subject ? [subject.#x, subject.#y] : false;
  }
}

the user code could look like

const [x1, y1] = Point.extract(line.p1);
const [x2, y2] = Point.extract(line.p2);

Sure, that doesn't do the work all on one line and during the destructuring, but the result is the same and much clearer to a reader, given that each part of the code is only doing one thing.

In addition to the conciseness of Extractors in destructuring, they also allow for optimal short-circuiting in Pattern Matching:

match (shape) {
  when { p1: Point(let p1), p2: Point(let p2) }: drawLine(p1, p2);
  ...
}

Here, the brand check in Point[Symbol.customMatcher] allows the pattern to attempt to match p1 and if that fails, short-circuit to the next when clause.

Ok, I'll grant you that in this case we do save one access of shape.p1 and shape.p2 compared to what's possible with switch, which also requires the break:

switch (true) {
  case shape.p1 instanceof Point && shape.p2 instanceof Point:
    drawLine(shape.p1, shape.p2);
    break;
  ...
}

Presuming that a Line class was being used, and with perhaps more common JS, those issues can be avoided:

if (shape instanceof Line) drawLine(shape.p1, shape.p2);
else ...

Another place they shine is within parameter lists where there is no Statement context within which to split out code without shifting work to the body:

function encloseLineInRect(
  {p1: Point(x1, y1), p2: Point(x2, y2)},
  padding = defaultPadding(x1 - x2, y1 - y2)
) {
  const xMin = Math.min(x1, x2) - padding;
  const xMax = Math.max(x1, x2) + padding;
  const yMin = Math.min(y1, y2) - padding;
  const yMax = Math.max(y1, y2) + padding;
  return new Rect(xMin, yMin, xMax, yMax);
}

function defaultPadding(dx, dy) {
  // use larger padding if x or y coordinates are close
  if (Math.abs(dx) < 0.5 || Math.abs(dy) < 0.5) {
    return 2;
  }
  return 1;
}

const rect = encloseLineInRect(line);

Here, extractors let you validate the input arguments and extract coordinates without needing to shift default argument handling for padding to the body.

On the other hand, if we were to presume that Point provided public accessors x and y, our code could look like:

function encloseLineInRect(
  { p1: { x: x1, y: y1 }, p2: { x: x2, y: y2 } },
  padding = defaultPadding(x1 - x2, y1 - y2)
) {
  ...
}

If the author of Point did require always getting both coordinates at the same time, then it'd probably make more sense to handle the default padding in the body:

function encloseLineInRect({ p1, p2 }, padding) {
  const [x1, y1] = Point.extract(p1);
  const [x2, y2] = Point.extract(p2);
  padding ??= defaultPadding(x1 - x2, y1 - y2);
  ...
}

Actually, having written out those alternatives, this last one is by far the clearest to me, as each part of it is only doing one thing, and it's much clearer that extracting the values actually involves some work.

Why should having this code in the function body be considered a bad thing?

rbuckton commented 4 months ago

My concerns aren't really about the problem statement, but about the scope within which a solution to the problem is desirable.

Overall, I get the value of extraction within the context of pattern matching, but not really outside that. Inside match or with an is test, the reader is primed to expect validation and, potentially, extraction. But in destructuring or elsewhere, not so much. In those other places, it comes as added complexity, and doesn't really bring a great benefit compared to current syntax patterns.

When I first discussed extractors with the Pattern Matching champions, their position was that Extractors must be part of destructuring if they are included in pattern matching, and I agree. There are clear parallels between the two syntaxes. Destructuring is, in its way, a limited form of pattern matching with no branching. Statements like const {x, y} = obj; and const [a, b] = ar; do perform validation, with the first expecting an object and the second expecting an iterable.
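The validation that plain destructuring already performs can be observed in today's JavaScript. The following is a minimal sketch; the helper name `errorNameOf` is hypothetical, and only the error type is checked because the message text varies by engine. Note that the object pattern rejects `null` and `undefined` specifically.

```javascript
// Hypothetical helper: runs a function and reports the type of error it throws.
function errorNameOf(fn) {
  try {
    fn();
    return "no error";
  } catch (err) {
    return err.constructor.name;
  }
}

// An object pattern throws when the value is null or undefined.
const objectPatternResult = errorNameOf(() => {
  const { x, y } = null;
});

// An array pattern throws when the value is not iterable.
const arrayPatternResult = errorNameOf(() => {
  const [a, b] = 42;
});

console.log(objectPatternResult); // "TypeError"
console.log(arrayPatternResult); // "TypeError"
```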

I've also received a fair amount of feedback from @littledan and others that, should pattern matching have this capability, then the same capability must be available in destructuring as well.

If we limited extractors to pattern matching alone, we would end up with a significant discrepancy between the conditional pattern syntax and non-conditional destructuring. This heavily impacts refactoring in cases where a function that previously accepted multiple possible types of inputs is rewritten to accept a single type of input. For example, let's say you are rewriting code containing a branching match clause:

match (value) {
    when Point(let x, let y): ...;
    when Line(let p1, let p2): ...;
}

As part of this rewrite, you determine that value can now only receive Point values, thus the other match legs are unnecessary. With extractors in destructuring, the refactor is fairly straightforward:

const Point(x, y) = value;

It is also consistent with an equivalent refactor from when [let a, let b] to const [a, b] or when { let x, let y } to const { x, y }.

Without extractors in destructuring, you have to either convert to is and variable patterns and roll your own error handling

if (!(value is Point(const x, const y))) throw new Error();

or manually validate and read values across multiple statements. While is/const is a valid use for variable patterns, it's far less readable or convenient than const Point(x, y). If you refactor to multiple statements, you have to ensure you can replicate the same validation logic the extractor performs (probably by also using is), and the extracted values must be reachable outside of the extractor. That refactoring might require visual inspection of the [Symbol.customMatcher] method to ensure rules are consistently applied, which will likely result in code duplication.

The extractor concept is not a new one. The syntax and semantics are directly inspired by Scala and Rust. For example, Rust supports let (x, y) = obj;, which is roughly the equivalent of const [x, y] = obj; in JS, and let Point(x, y) = obj; which would be the equivalent of const Point(x, y) = obj; in JS. In Rust every variable binding is just pattern matching, but that's not an option for JS. In the Pattern Matching proposal I've already had discussions with the champions about the possibility of extending the existing destructuring syntax to a full pattern matching syntax, but that direction was ultimately rejected — destructuring offers no mechanism to reference a value in scope, which would be necessary for patterns like x is +Infinity or x is undefined, and does not have the correct exhaustiveness requirements for array patterns since [1, 2] in pattern matching is exhaustive (input must have exactly two elements) while [a, b] in destructuring is not exhaustive.
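The exhaustiveness difference for array patterns can be seen with current destructuring. In this small sketch, `hasExactlyTwoElements` is a hypothetical helper that only approximates the exact-length requirement described above for arrays; it is not how pattern matching is specified.

```javascript
// Array destructuring silently ignores extra elements...
const [first, second] = [1, 2, 3];

// ...and tolerates missing ones, binding undefined instead of failing.
const [only, missing] = [1];

// Hypothetical approximation of an exact-length check for a two-element pattern.
function hasExactlyTwoElements(value) {
  return Array.isArray(value) && value.length === 2;
}

console.log(first, second); // 1 2
console.log(only, missing); // 1 undefined
console.log(hasExactlyTwoElements([1, 2, 3])); // false
console.log(hasExactlyTwoElements([1, 2])); // true
```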

Reviewing past presentations and discussions of this proposal, it seems like I'm echoing concerns raised by @brad4d during the 2022-09 meeting:

BSH: Yeah, my main concern here is, I feel like this feature is creating a way to write code that's very terse and clever but not necessarily easier to get right or easier to understand. In several of the examples, it looks to me like, you'd have to - to really know what's going on you would have to go and read what the extractor code does to understand. Well, what things can I use this extractor with? And what can I put on the right hand side? I don't see the benefit to balance out the readability. or what, wait, what does this in some way? Let you write more efficient code easily. or does it somehow - aside from the magical feeling and cleverness, what's the benefit? [...] So I think what would address this - Not something you can do now, but what would address my concern here, is if I could have a solid example of some code. Like, here's how you would write the code with this feature available and here's how you write it without it, and it's not just that it's shorter, but it should be actually easier to understand and maybe more performant. because, there's some benefit that you're getting performance-wise out of like the environment doing this for you. That's what I would need to see. That's all I'm saying. I'm not trying to block this or anything. I just that's that's my concern. I feel like it's just feels too clever and not really overall beneficial. That's it for me.

In other words, my concern is that the examples currently in the readme or presented in the preceding discussion don't really seem like they present the new syntax as "easier to understand and maybe more performant".

I've since added further examples to the explainer and in the slides presented in February, but I'm happy to add more.

I believe the extractor syntax itself is not terribly difficult to learn and understand, especially if the same syntax in pattern matching gains significant traction in the community, which I expect it will. The most common forms of extractors will be duals to constructors:

const p = new Point(1, 2);
const Point(x, y) = p;

This duality is also an important characteristic to a planned future proposal for ADT enums, as an evolution of the enum proposal Jack Works previously presented and the enum proposal I'd been working on. The current design for ADT enums would maintain this duality, leveraging extractors as a JS mechanism to mirror similar mechanisms in languages like Scala, Rust, C#, etc.:

enum Message {
    FillBackground({ r, g, b }),
    DrawText(text, p, { halign = "center", valign = "middle" } = {}),
    DrawLine(p1, p2, color, [endcap1 = "dot", endcap2 = "dot"] = [])
}

function onMessage(msg) {
  match (msg) {
    when Message.FillBackground({ let r, let g, let b }): gfx.fill(r, g, b);
    when Message.DrawText: drawText(msg);
    when Message.DrawLine: drawLine(msg);
  }
}

function drawText(msg) {
  const Message.DrawText(text, p, { halign, valign }) = msg;
  ...
}

function drawLine(msg) {
  const Message.DrawLine(p1, p2, color, endcaps) = msg;
  drawStroke(p1, p2, color);
  drawEndcaps(p1, p2, color, endcaps);
}

One of my main focus areas for ADT enums has been the potential for significantly better performance than regular JS objects due to a fixed shape and fixed domain, which have the potential to positively impact IC optimizations. I've only recently started discussing the possibilities of ADT enums with V8 as I've been fleshing out the design and goals for that proposal.

As a result, performance is a key area of focus for extractors as well. The worst performing part of destructuring today is array destructuring. As far as I am aware, no engine optimizes array destructuring today. This gives us two directions to pursue for extractors. First, we could specify that the return value from an extractor must be an Array, such that we can leverage length and ordinal indexes for destructuring, which has the potential to be as fast as const { 0: x, 1: y } = obj is in most engines today. Second, we can work with implementers to improve "array" destructuring performance in general for Array (or even typed arrays), which has the potential to positively impact performance across the board in destructuring and pattern matching. I find the second option the most promising one, as improving the performance of const [x, setX] = useState() would greatly benefit the React community.
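The semantic difference between the two forms can be made visible without measuring anything. The sketch below says nothing about speed; it only shows that an array pattern goes through the iteration protocol while an "Array-as-object" pattern reads index properties. The counter and variable names are made up for illustration.

```javascript
// Count how often the iteration protocol is entered for one particular array.
let iteratorCalls = 0;
const pair = ["value", "setter"];
const defaultIterator = Array.prototype[Symbol.iterator];

// Shadow Symbol.iterator on this instance only, so each use is observable.
Object.defineProperty(pair, Symbol.iterator, {
  value() {
    iteratorCalls += 1;
    return defaultIterator.call(this);
  },
});

// Array destructuring asks the value for an iterator.
const [viaIterator] = pair;
const callsAfterArrayPattern = iteratorCalls;

// Array-as-object destructuring reads properties "0" and "1" directly.
const { 0: viaIndex, 1: setterViaIndex } = pair;
const callsAfterObjectPattern = iteratorCalls;

console.log(callsAfterArrayPattern); // 1
console.log(callsAfterObjectPattern); // 1 (the iterator was not used again)
```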

Extractors encourage a leaner coding style that is often preferred by JS developers and is regularly seen in other languages with language features like Pattern Matching:

// (assumes Point.{ x, y } are public for the purpose of this example)

const { p1, p2 } = line; // top-level destructuring
if (!(p1 instanceof Point) || !(p2 instanceof Point)) throw new Error(); // data validation
const { x: x1, y: y1 } = p1; // nested destructuring
const { x: x2, y: y2 } = p2; // nested destructuring

// vs
const { p1: Point(x1, y1), p2: Point(x2, y2) } = line;

On the other hand, if we were to presume an equivalent Point definition like

class Point {
  #x;
  #y;
  constructor(x, y) {
    this.#x = x;
    this.#y = y;
  }
  static extract(subject) {
    return #x in subject ? [subject.#x, subject.#y] : false;
  }
}

the user code could look like

const [x1, y1] = Point.extract(line.p1);
const [x2, y2] = Point.extract(line.p2);

Sure, that doesn't do the work all on one line and during the destructuring, but the result is the same and much clearer to a reader, given that each part of the code is only doing one thing.

Once pattern matching advances you would want this logic in a [Symbol.customMatcher] anyways, and explicitly invoking symbol-named methods is generally unfavorable.

This also doesn't have equivalent semantics to what is proposed. If line.p1 in your example is not a Point, you'll receive the rather opaque error false is not iterable. Implementations will be able to provide a more reasonable error using the extractor syntax, such as p1 does not match Point. To get equivalent semantics, you must wrap the call to [Symbol.customMatcher] within a method that throws when that call returns false, which is a lot of boilerplate code that we can avoid.
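To make that boilerplate concrete, here is a rough sketch of such a wrapper in today's JavaScript. It rests on several assumptions: `customMatcher` is a local stand-in symbol because engines do not ship `Symbol.customMatcher`, `extractOrThrow` and its `label` parameter are hypothetical, and the error text simply mirrors the example message above rather than anything an engine is known to produce.

```javascript
// Stand-in for the proposed well-known symbol (assumption: not yet available in engines).
const customMatcher = Symbol.customMatcher ?? Symbol("customMatcher");

class Point {
  #x;
  #y;
  constructor(x, y) {
    this.#x = x;
    this.#y = y;
  }
  // Returns the extracted values, or false when the brand check fails.
  static [customMatcher](subject) {
    const isObject = typeof subject === "object" && subject !== null;
    return isObject && #x in subject ? [subject.#x, subject.#y] : false;
  }
}

// Hypothetical wrapper: turns a failed match into a descriptive error.
function extractOrThrow(extractor, subject, label) {
  const result = extractor[customMatcher](subject);
  if (!result) {
    throw new TypeError(`${label} does not match ${extractor.name}`);
  }
  return result;
}

const line = { p1: new Point(1, 2), p2: { x: 3, y: 4 } };
const [x1, y1] = extractOrThrow(Point, line.p1, "p1");

let failureMessage;
try {
  extractOrThrow(Point, line.p2, "p2"); // p2 is a plain object, not a Point
} catch (err) {
  failureMessage = err.message;
}

console.log(x1, y1); // 1 2
console.log(failureMessage); // "p2 does not match Point"
```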

The other issue with just having a static extract method is that it doesn't guarantee a consistent API design. For each Point.extract() there might also be a Message.unapply() or Node.deconstruct(). Extractors and Symbol.customMatcher define a common protocol for this mechanism along with syntactic support for that protocol, much like for..of and Symbol.iterator or using and Symbol.dispose.

In addition to the conciseness of Extractors in destructuring, they also allow for optimal short-circuiting in Pattern Matching:

match (shape) {
  when { p1: Point(let p1), p2: Point(let p2) }: drawLine(p1, p2);
  ...
}

Here, the brand check in Point[Symbol.customMatcher] allows the pattern to attempt to match p1 and if that fails, short-circuit to the next when clause.

Ok, I'll grant you that in this case we do save one access of shape.p1 and shape.p2 compared to what's possible with switch, which also requires the break:

switch (true) {
  case shape.p1 instanceof Point && shape.p2 instanceof Point:
    drawLine(shape.p1, shape.p2);
    break;
  ...
}

Presuming that a Line class was being used, and with perhaps more common JS, those issues can be avoided:

if (shape instanceof Line) drawLine(shape.p1, shape.p2);
else ...

For this narrow case, maybe, but who is to say that, for a given Line, the p1 property might not be a Point | Point3D that you must distinguish between, or that you won't be handed some other object with a property whose value could be any constituent of a discriminated union?

In general, switch (true) is a bit of an antipattern, but not only for the reasons that you mentioned. Yes, one of the values of switch (foo) is that it avoids repetition of foo, but switch (true) limits optimizations for case clauses since each clause must be reevaluated every time the switch statement is run, meaning there's no opportunity to optimize the switch into, say, a jump table. It's likely that match will suffer from that limitation as well by its very nature, but it's a poor choice for switch.

Another place they shine is within parameter lists where there is no Statement context within which to split out code without shifting work to the body:

function encloseLineInRect(
  {p1: Point(x1, y1), p2: Point(x2, y2)},
  padding = defaultPadding(x1 - x2, y1 - y2)
) {
  const xMin = Math.min(x1, x2) - padding;
  const xMax = Math.max(x1, x2) + padding;
  const yMin = Math.min(y1, y2) - padding;
  const yMax = Math.max(y1, y2) + padding;
  return new Rect(xMin, yMin, xMax, yMax);
}

function defaultPadding(dx, dy) {
  // use larger padding if x or y coordinates are close
  if (Math.abs(dx) < 0.5 || Math.abs(dy) < 0.5) {
    return 2;
  }
  return 1;
}

const rect = encloseLineInRect(line);

Here, extractors let you validate the input arguments and extract coordinates without needing to shift default argument handling for padding to the body.

On the other hand, if we were to presume that Point provided public accessors x and y, our code could look like:

function encloseLineInRect(
  { p1: { x: x1, y: y1 }, p2: { x: x2, y: y2 } },
  padding = defaultPadding(x1 - x2, y1 - y2)
) {
  ...
}

If the author of Point did require always getting both coordinates at the same time, then it'd probably make more sense to handle the default padding in the body:

function encloseLineInRect({ p1, p2 }, padding) {
  const [x1, y1] = Point.extract(p1);
  const [x2, y2] = Point.extract(p2);
  padding ??= defaultPadding(x1 - x2, y1 - y2);
  ...
}

Actually, having written out those alternatives, this last one is by far the clearest to me, as each part of it is only doing one thing, and it's much clearer that extracting the values actually involves some work.

Why should having this code in the function body be considered a bad thing?

This doesn't necessarily have the same semantics as the example I posted. ??= includes null, while parameter defaults do not, and a different function might consider null to be a valid argument with an independent meaning. Also, as I've mentioned before this doesn't provide a useful error at the sites of each .extract call, unless .extract itself throws on an invalid input, and that again is boilerplate you must provide for every .extract method you write.
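The null difference is easy to demonstrate. A minimal sketch, with function names invented for illustration:

```javascript
// Parameter default: applied only when the argument is undefined.
function withParameterDefault(padding = 1) {
  return padding;
}

// Body default via ??=: applied when the argument is undefined or null.
function withBodyDefault(padding) {
  padding ??= 1;
  return padding;
}

console.log(withParameterDefault(undefined)); // 1
console.log(withParameterDefault(null)); // null
console.log(withBodyDefault(undefined)); // 1
console.log(withBodyDefault(null)); // 1
```

A function that treats null as a meaningful argument would therefore need an explicit `padding === undefined` check in the body instead of `??=`.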

This is a fairly limited example. I've seen real world code with functions that have 8+ parameters, some with consecutive defaults based on prior arguments and their defaults. If any of those defaults would depend on a destructured input, you might have to shift all of that logic to the body to preserve evaluation order. That can have other effects as well, such as changing the .length property of the function. If .length is an important characteristic for your function, you're then having to either replace parameter defaults with = undefined or rely on arguments or a rest parameter. In non-trivial cases all of these papercuts can add up.
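The effect on .length can be checked directly, since .length counts only the parameters before the first default or rest parameter. The function names below are made up for illustration:

```javascript
// Default in the parameter list: only `line` counts toward length.
function defaultInParameterList(line, padding = 1) {
  return padding;
}

// Default moved to the body: both parameters now count.
function defaultInBody(line, padding) {
  padding ??= 1;
  return padding;
}

// Workaround: keep a placeholder default so length stays the same.
function placeholderDefault(line, padding = undefined) {
  padding ??= 1;
  return padding;
}

console.log(defaultInParameterList.length); // 1
console.log(defaultInBody.length); // 2
console.log(placeholderDefault.length); // 1
```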

In addition, unless we can convince implementations to optimize array destructuring, extractors will likely require Array return values and leverage an Array-as-object destructuring approach. If that is the case, const Point(x, y) = obj could be up to 30% faster than const [x, y] = Point.extract(obj) because normal array destructuring is so slow (based on some rudimentary microbenchmark comparisons of object destructuring vs array destructuring in V8).

eemeli commented 4 months ago

So is this proposal predicated on pattern matching, or should its merits be assessed independently of that proposal? Most of the arguments being made for this proposal appear to rely on pattern matching, but it's being advanced independently and ahead of pattern matching, which would suggest the opposite.

Right now I'm struggling a bit to identify the value of extractor objects without reference to pattern matching, which, if accepted, would indeed bring in custom matcher methods to account for pattern matching's inability to include more than one expression per case.

In that pattern matching world, it does become more interesting to use the same matching syntax outside the match, but even then it's not clear to me that the destructor approach proposed here is the right one. Compared to what's currently proposed for pattern matching, destructor extractors:

  1. Throw rather than return false if the custom matcher fails.
  2. Have completely different syntax inside the parentheses.

Without do-expressions, it's also not at all clear that the refactoring complexities of taking a single expression out of a match and adjusting it for life outside would be actually hard.

Separately from pattern matching, you've also brought up algebraic data types as something with which extractors would be useful. If that should be considered a part of the value proposition here, is there a better definition of exactly what they might look like, and should they too be considered as a dependency of this proposal?

Finally, inline replies to your last three paragraphs, which consider our current world without pattern matching:

This doesn't necessarily have the same semantics as the example I posted. ??= includes null, while parameter defaults do not, and a different function might consider null to be a valid argument with an independent meaning.

Sure, null could be a valid value so the ??= shorthand is not universal. In this case, though, the padding is being added to a number. But this is rather irrelevant to the larger point of default values being definable in the function body, often with succinct syntax and clear meaning.

Also, as I've mentioned before this doesn't provide a useful error at the sites of each .extract call, unless .extract itself throws on an invalid input, and that again is boilerplate you must provide for every .extract method you write.

You're right, an extractor object does allow for slightly more informative errors than the current "false is not iterable" or similar. However, those errors do already tend to come with a source reference, so this seems like a marginal improvement.

This is a fairly limited example. I've seen real world code with functions that have 8+ parameters, some with consecutive defaults based on prior arguments and their defaults. If any of those defaults would depend on a destructured input, you might have to shift all of that logic to the body to preserve evaluation order. That can have other effects as well, such as changing the .length property of the function. If .length is an important characteristic for your function, you're then having to either replace parameter defaults with = undefined or rely on arguments or a rest parameter. In non-trivial cases all of these papercuts can add up.

Sure, code like that exists, and refactoring it can be a pain. Though tbh 8+ parameters or depending on function length kinda sounds like it's complicated enough without defaults depending on previous destructured inputs. On the flipside, do I understand right that the cost of this would be to accept code like this as valid?

function foo(bar(baz)) { ... }

Actually, can extractors nest? As in, could this too be valid?

function foo(bar(baz(quux))) { ... }

In addition, unless we can convince implementations to optimize array destructuring, extractors will likely require Array return values and leverage an Array-as-object destructuring approach. If that is the case, const Point(x, y) = obj could be up to 30% faster than const [x, y] = Point.extract(obj) because normal array destructuring is so slow (based on some rudimentary microbenchmark comparisons of object destructuring vs array destructuring in V8).

Have you tested the performance of a custom extractor against code for a Point implementation that doesn't try to hide its coordinates?

const x = obj.x;
const y = obj.y;

// or

const { x, y } = obj;

rbuckton commented 4 months ago

So is this proposal predicated on pattern matching, or should its merits be assessed independently of that proposal? Most of the arguments being made for this proposal appear to rely on pattern matching, but it's being advanced independently and ahead of pattern matching, which would suggest the opposite.

Extractors have been proposed for both pattern matching and for destructuring, with Extractors for pattern matching folded into the pattern matching proposal itself. The pattern matching proposal is large enough on its own that it is best to have a separate proposal for Extractors for destructuring and manage the commonalities as cross-cutting concerns between the two proposals.

Though there is a strong link to pattern matching, Extractors are intended to stand on their own. They fill a capability gap in the destructuring syntax and, as @erights put during the February plenary, "[the Extractor syntax] has a tremendous amount of reach. And it’s fairly small and elegant for something with this much reach, this many different things you can apply it to in a coherent and unified manner."

When I presented an update on Extractors in February, I'd indicated that I planned to bring Extractors for Stage 2 around the time that Pattern Matching was proposed for Stage 2. At that time I was encouraged by many on the committee to consider advancing the proposal independently regardless as to whether Pattern Matching was also seeking advancement as they need not advance in lock step so long as cross-cutting concerns are maintained.

Right now I'm struggling a bit to identify the value of extractor objects without reference to pattern matching, which, if accepted, would indeed bring in custom matcher methods to account for pattern matching's inability to include more than one expression per case.

I am not clear on what you mean by "inability to include more than one expression per case". Can you clarify? Do you mean the inability to accept multiple arguments? Extractors are meant to work on a single input value to produce zero or more output values, just as constructors accept zero or more inputs to produce a single output. There is an intentional duality behind the design. This one-to-many behavior is also explicitly consistent with other destructuring patterns which take one object or one array to produce zero or more bindings. Extractors are unary functions by design.

In that pattern matching world, it does become more interesting to use the same matching syntax outside the match, but even then it's not clear to me that the destructor approach proposed here is the right one. Compared to what's currently proposed for pattern matching, destructor extractors:

  1. Throw rather than return false if the custom matcher fails.

Just as in Rust, destructuring extractors are irrefutable. There is no alternative to pursue when you write const Point(x, y) = obj, so the only option is to throw. This is, again, consistent with {} and [] in destructuring which throw if the input is not an object or an iterable, respectively. This is also consistent with match expressions which will also throw if the subject is not matched by any when clause. Also, it is not the extractor object that throws, but the extractor syntax. The [Symbol.customMatcher] method is still expected to return false (or any falsy value) when the match fails.

  2. Have completely different syntax inside the parentheses.

This syntax difference is precisely the same syntax difference as [] and {} have between pattern matching and destructuring. If there is a cost to be paid in learning the differences between pattern matching and destructuring, that cost will be paid regardless as to whether extractors are present. let/const patterns in pattern matching are a necessity to disambiguate between identifier references and bindings, which is a distinction pattern matching must make but destructuring does not. In my opinion, (2) is an example of consistency in the design, not inconsistency.

Without do-expressions, it's also not at all clear that the refactoring complexities of taking a single expression out of a match and adjusting it for life outside would be actually hard.

Can you clarify what you mean here? I'm not sure the presence or absence of do expressions affects this at all.

Separately from pattern matching, you've also brought up algebraic data types as something with which extractors would be useful. If that should be considered a part of the value proposition here, is there a better definition of exactly what they might look like, and should they too be considered as a dependency of this proposal?

This proposal does not depend on ADT enums, but is certainly informed by them. It's too early to say whether ADT enums will have the performance characteristics I'm hoping for. There have been discussions about them here and here, though neither fully represents my current thinking on the approach. I'm not ready to share the new design publicly yet, but I'd be happy to discuss some of the thinking in an offline discussion.

Sure, code like that exists, and refactoring it can be a pain. Though tbh 8+ parameters or depending on function length kinda sounds like it's complicated enough without defaults depending on previous destructured inputs. On the flipside, do I understand right that the cost of this would be to accept code like this as valid?

function foo(bar(baz)) { ... }

Actually, can extractors nest? As in, could this too be valid?

function foo(bar(baz(quux))) { ... }

Yes, this would be valid, and yes, extractors can nest. Most extractors are likely to be classes (or ADT enums when that time comes) and the generally accepted naming convention for such definitions in JS is pascal-case, thus the more common case will be something like

function foo(Bar(baz)) { ... }

and thus be more readily distinguished, not to mention that most editors will apply syntax highlighting that further distinguishes an extractor from a parameter.

While likely not the norm, nested extractors have their place when combined with utility types like Option. The Message refactoring I mentioned above could very easily be an Option.Some(Message.DrawText(text)), but it's rare to see anything more complex in the wild in other languages.
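As a rough illustration of what a nested extractor does, the following sketch emulates Option.Some(Message.DrawText(text)) in JavaScript that runs today. The Some and DrawText classes, the extract helper, and the matcher symbol are all hypothetical stand-ins (current engines do not define Symbol.customMatcher), so this is only an approximation of the proposed semantics:

```javascript
// Stand-in for the proposed Symbol.customMatcher (an assumption for this sketch).
const matcher = Symbol("customMatcher stand-in");

// Hypothetical utility type: Option.Some wraps a single value.
class Some {
  constructor(value) { this.value = value; }
  // Returns an array of extracted values on success, false on failure.
  static [matcher](subject) {
    return subject instanceof Some && [subject.value];
  }
}

// Hypothetical message variant carrying a text payload.
class DrawText {
  constructor(text) { this.text = text; }
  static [matcher](subject) {
    return subject instanceof DrawText && [subject.text];
  }
}

const Option = { Some };
const Message = { DrawText };

// Hypothetical helper: run the matcher and throw if the subject does not match,
// roughly what a failed extractor in a destructuring position would do.
function extract(extractor, subject) {
  const result = extractor[matcher](subject);
  if (!result) throw new TypeError("subject did not match extractor");
  return result;
}

// Emulates: const Option.Some(Message.DrawText(text)) = input;
const input = new Option.Some(new Message.DrawText("hello"));
const [inner] = extract(Option.Some, input);      // outer extractor
const [text] = extract(Message.DrawText, inner);  // nested extractor
console.log(text);
```

Each level of nesting is one matcher call on the value produced by the level above it, which is why two levels remain easy to follow.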

Have you tested the performance of a custom extractor against code for a Point implementation that doesn't try to hide its coordinates?

const x = obj.x;
const y = obj.y;

// or

const { x, y } = obj;

If you break down an extractor to its independent steps and use "Array-as-object" destructuring, extractors would essentially have the same overhead as this:

const { 0: x, 1: y } = Point.extract(obj);
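For concreteness, here is a minimal sketch of a Point that does hide its coordinates, together with the Point.extract step used above. The class body and the brand check are assumptions made for illustration; nothing here is taken from the proposal text itself:

```javascript
// Hypothetical Point whose coordinates are private fields.
class Point {
  #x;
  #y;

  constructor(x, y) {
    this.#x = x;
    this.#y = y;
  }

  // Brand check plus extraction: `#x in subject` is true only for objects
  // constructed by this class, so look-alike objects are rejected.
  static extract(subject) {
    const isPoint =
      typeof subject === "object" && subject !== null && #x in subject;
    if (!isPoint) throw new TypeError("subject is not a Point");
    return [subject.#x, subject.#y];
  }
}

const obj = new Point(1, 2);

// "Array-as-object" destructuring: reads properties 0 and 1 directly
// instead of going through the iterator protocol.
const { 0: x, 1: y } = Point.extract(obj);
console.log(x, y);
```

The brand check and the array allocation are the costs that plain { x, y } destructuring avoids, and they are the part that would be paid with or without extractor syntax.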

The 30% number above came from some microbenchmarks we ran comparing the performance of the following two statements in React:

// scenario 1 - array destructuring
const [foo, setFoo] = useState();

// scenario 2 - array-as-object destructuring
const { 0: foo, 1: setFoo } = useState();

The "Array-as-object" destructuring scenario was ~30% faster than iterator destructuring. We don't have any numbers comparing { 0: x } destructuring to { x: x } destructuring, though it's possible implementations could handle numeric index properties in a less efficient way than regular properties. Aside from that, there is expected overhead incurred when invoking user code, whether it's Point.extract() or Point[Symbol.customMatcher](). If Point doesn't hide its coordinates and you don't need the brand check, then { x, y } will obviously be faster. But if you need the brand check or some other data transformation behavior in addition to destructuring, you'd be paying that cost regardless.

If implementations don't opt to invest in array destructuring improvements, and Extractors choose to use "Array-as-object" destructuring as a result, I expect someone will write

const Elements = {
  [Symbol.customMatcher](subject) {
    return Array.isArray(subject) && subject;
  }
};

to get that speed boost for destructuring when the input is known to be an array:

const [a, b] = ar; // iterator destructuring
// vs.
const Elements(a, b) = ar; // "Array-as-object" destructuring

which is likely still faster than normal array destructuring even with the user-code overhead, and I expect function inlining could make that overhead fairly negligible as well.
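To make the comparison concrete, the sketch below approximates what const Elements(a, b) = ar would do under the hypothetical scenario described here, where extractors use "Array-as-object" destructuring. The customMatcher symbol is a stand-in (engines do not define Symbol.customMatcher today) and extractWith is an invented helper, so treat this as an approximation rather than the specified semantics:

```javascript
// Stand-in for the proposed Symbol.customMatcher (an assumption for this sketch).
const customMatcher = Symbol.for("customMatcher (stand-in)");

const Elements = {
  [customMatcher](subject) {
    // Returns the array itself on success, false otherwise.
    return Array.isArray(subject) && subject;
  }
};

// Hypothetical helper emulating the extractor call in a destructuring position:
// a falsy matcher result is treated as a failed match and throws.
function extractWith(extractor, subject) {
  const result = extractor[customMatcher](subject);
  if (!result) throw new TypeError("subject did not match extractor");
  return result;
}

const ar = [10, 20, 30];

// Emulates: const Elements(a, b) = ar;
// Indexed property reads, no iterator created.
const { 0: a, 1: b } = extractWith(Elements, ar);
console.log(a, b);
```

The only user code on the hot path is the Array.isArray check, which is the part function inlining would be expected to absorb.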

rbuckton commented 4 months ago

I should also note that Pattern Matching is on the agenda for possible advancement to Stage 2 at this plenary.

eemeli commented 4 months ago

Right now I'm struggling a bit to identify the value of extractor objects without reference to pattern matching, which, if accepted, would indeed bring in custom matcher methods to account for pattern matching's inability to include more than one expression per case.

I am not clear on what you mean by "inability to include more than one expression per case". Can you clarify? Do you mean the inability to accept multiple arguments? Extractors are meant to work on a single input value to produce zero or more output values, just as constructors accept zero or more inputs to produce a single output. There is an intentional duality behind the design. This one-to-many behavior is also explicitly consistent with other destructuring patterns which take one object or one array to produce zero or more bindings. Extractors are unary functions by design.

I meant that the body of a pattern matching case is a single expression, rather than allowing multiple statements as switch does. This means that in a match, there's no "function body" in which to e.g. destructure values so that they can be used, and therefore it's much more useful to be able to get it all done in the MatchPattern.

Without do-expressions, it's also not at all clear that the refactoring complexities of taking a single expression out of a match and adjusting it for life outside would be actually hard.

Can you clarify what you mean here? I'm not sure the presence or absence of do expressions affects this at all.

Without do-expressions, the pattern match body is a single expression for each case, and therefore it's reasonable to expect it to be relatively simple, such as a single function call. Refactoring something like that is in general simpler than refactoring a generic block of code. Of course the body could be an IIFE or something, but that's unlikely to be too common.

eemeli commented 4 months ago

Closing in favour of #20, as runtime types are a pretty clear value proposition.

rbuckton commented 4 months ago

Without do-expressions, the pattern match body is a single expression for each case, and therefore it's reasonable to expect it to be relatively simple, such as a single function call. Refactoring something like that is in general simpler than refactoring a generic block of code. Of course the body could be an IIFE or something, but that's unlikely to be too common.

If do expressions do not materialize, it's possible that match will subsume a limited form of do to support statement execution, e.g.,

match (value) {
  when pattern: do {
    // statements
  }
}

The ability to support statements is important to many of the pattern matching champions as they also want match to be a "better" switch.

erights commented 4 months ago

See https://github.com/tc39/proposal-pattern-matching/issues/322 , especially the end of the thread starting at https://github.com/tc39/proposal-pattern-matching/issues/322#issuecomment-2045865071