Error subclasses: how are they checked?

ljharb commented 2 years ago

(from #263)

Checking the [[ErrorData]] slot ensures it's an Error or an Error subclass.

However, when someone matches against, say, TypeError, we only want a TypeError to satisfy this.

Here are some options:

do a Get of name on the constructor, do a Get of the constructor on the value and a Get of name on that constructor, and compare them (most observable, least robust)
do a Get of constructor on the value, and SameValue to the constructor (better than the previous, but still relies on an observable and forgeable lookup)
add a new subclass-specific internal slot to NativeError and AggregateError, and check for it on the value (most robust, but adds a new capability that isn't already present)

ljharb commented 2 years ago

My preference is the third, because it's the most robust.

Userland subclasses, of course, can and should use a private field and #brand in obj, which has the same semantics as option 3.

Jack-Works commented 2 years ago

Current algorithm (as I wrote yesterday):

Requires the [[ErrorData]] internal slot.
Check if the value.constructor equals this, if so, it matches
Check if the value.constructor.name equals this.name, if so, it matches.

This algorithm automatically works on all sub-classes (even from the user land) of the Error class, and it also works cross-realm.

theScottyJam commented 2 years ago

@ljharb With option 3, if I were to do this:

export class MySyntaxError extends SyntaxError {
  customProperty = ...;
}

and then a user of my library tried to do this:

function handleError(error) {
  return match (error) {
    when (${MySyntaxError}): console.error(`Error occured with value: ${error.customProperty}`)
    ...
    default: console.error('Unknown error!')
  };
}

This code will act like it works just fine, until a normal SyntaxError gets passed in, at which point it'll go down the wrong branch and end up logging Error occurred with value: undefined. Is this a correct understanding? This seems like an unfortunate foot-gun is subclasses inherit the pattern-matching behaviors like this. The same applies for inheriting non-error classes as well, but that tends to be less common.

Jack-Works commented 2 years ago

I'm worrying we need to consider all commonly subclassed features:

Map
Set
Array
Error

ljharb commented 2 years ago

@Jack-Works what you wrote yesterday is option 1. However, option 3 also works cross-realm, and all options will work with proper subclasses, since they'll have the proper slots.

Everything that's not Error already will work, because a proper subclass will have the internal slot of the base. It's just that Error subclasses don't have a unique slot - just a slot shared with the base Error.

ljharb commented 2 years ago

@theScottyJam with all three options, your subclass will be treated as a SyntaxError, since it will have the proper slots.

You are certainly correct that your subclass would have to define its own custom matcher if you wanted to be able to robustly match against your subclass instances.

Perhaps another choice would be to store new.target in an internal slot, and then access that - but then Error subclasses would no longer work cross-realm.

theScottyJam commented 2 years ago

Perhaps another choice would be to store new.target in an internal slot, and then access that - but then Error subclasses would no longer work cross-realm.

Could there perhaps be a mix? If you're creating an instance of an Error, or of a built-in subclass of an error, then it uses option 3 and stores some value unique to that builtin-type in an internal slot, but if it's a userland subclass we store new.target?

ljharb commented 2 years ago

I suppose we could do that. Each builtin subclass would check new.target, and if it was SameValue, it'd set the internal slot to the string name, and if not, it'd set it to new.target. That would work cross-realm with all builtins, and same-realm (the only option) with userland subclasses. It'd get a bit more complex to handle in Error itself, but it's doable.

Jack-Works commented 2 years ago

class MyError extends Error {}
class AnotherError extends Error {}

match (new MyError) {
    when (${AnotherError}): can this match?
    when (${MyError}): can this match?
    when (${Error}): can this match?
}

ljharb commented 2 years ago

@Jack-Works yea, if we go with https://github.com/tc39/proposal-pattern-matching/issues/265#issuecomment-1178612254 they would all match as expected.

tabatkins commented 2 years ago

Yeah, #3 is probably what I'd expect. It does indeed mean that subclasses will test as correct, but that's identical to how any other class's subclasses will work, so that seems expected - classes inherit their static members from their superclass as well, which will include the [Symbol.matcher] method. There is literally no way to avoid this without bypassing the class system entirely (some sort of registry?) which seems bad. (Subclassing is, itself, often bad, but that's something every coder needs to learn for themselves, apparently.)

ljharb commented 2 years ago

Agreed. I'll prepare a PR to make that change.

Jack-Works commented 2 years ago

class MyError extends Error {}
class AnotherError extends Error {}

match (new MyError) {
    when (${AnotherError}): can this match?
    when (${MyError}): can this match?
    when (${Error}): can this match?
}

if we can support this, it will be good.

ljharb commented 2 years ago

@Jack-Works #273 as written would have them all match for all Error instances; a subclass would always need to define its own custom matcher to be more specific (just like every other subclass of any other builtin).

theScottyJam commented 2 years ago

but that's identical to how any other class's subclasses will work

I wouldn't want Error subclassing to behave differently. If we were to do a combination of #1 and #3, I think it's best to do it for all built-ins or not do it at all. Error subclasses shouldn't get special treatment in this regard.

classes inherit their static members from their superclass as well, which will include the [Symbol.matcher] method. There is literally no way to avoid this without bypassing the class system entirely (some sort of registry?) which seems bad.

I think it's a matter of point of view as well. One way to view this is "When you use a class when pattern-matching, it does a brand-check operation between your instance and that class", and here, I will define "brand check" to specifically mean "check if the object is an instance of any incarnation of the provided class". In the case of a built-in class, it's possible that there's multiple incarnations of the same class, one per realm. In the case of a userland class, there's only ever going to be one incarnation, as there's no mechanism to recreate the same userland class across multiple realms.

So, for us outsiders, brand-checking is a single operation, there's no magic dancing around the inheritance hierarchy that we can observe (at least, according to my extra-specific definition of brand-checking here). It all just works as expected. If the spec chooses to implement this behavior with some inheritance magic, so be it, but at least our mental model will be a simple and intuitive one.

On the other hand, what would be extremely unfortunate is if every userland class that's currently inheriting a built-in class automatically received an incorrect, mental-model-breaking implementation of Symbol.matcher. In such a scenario, perhaps the implementations doesn't have to figure out how to make userland and built-in classes play nicely with inheritance, but now us outsiders will have a mental model that basically goes "if it's a built-in Error class it does a brand-check and if it's a userland Error class don't use it with pattern-matching because it will look like it'll work, but it actually won't work properly, unless there's specific documentation saying that you can do this", i.e. now we mentally think about it as two separate operations depending on what kind of class we're operating on.

I'm sure if we went with route #3 alone, future me will be second-guessing every time I see a userland error used with pattern-matching when reading other people's code, and I'll want to go look at the definition or the documentation to make sure it's actually being used properly and that we don't have silent bugs in the code.

(Subclassing is, itself, often bad, but that's something every coder needs to learn for themselves, apparently.)

Yes, I do agree here, so I feel less bad if, for example, custom array classes are inheriting a misbehaving Symbol.matcher, because there's many additions we can make to the array class that will cause subclasses to misbehave - I know the proposed Array.groupBy() method was having issues in this regard as well. It is, however, much more unfortunate with Error classes, as those are often seen as the exception when it comes to inheritence (people who hate inheritance will still subclass the Error class), and it's certainly a common pattern in the JavaScript community to subclass Error.

ljharb commented 2 years ago

You have to do that checking for any interface/protocol with any subclass every time, though.

theScottyJam commented 2 years ago

You have to do that checking for any interface/protocol with any subclass every time, though.

If I'm writing the code, yes, I should check the API documentation and not just assume it behaves a certain way.

If I'm reading existing code, I'm not double-checking everything that's happening in the code base, unless something looks fishy. For example:

const user = new User();
...
writeToLog(`Finished doing operation X for ${user}`);

Something like this looks fishy to me. Is it possible that the User class provides an implementation fortoString() that makes that writeToLog() call work as expected? Sure. But, 9 times out of 10 this isn't what's happening, there is no toString() method and the developer just forgot to access the username property on the user object, so what's really going to happen is that that's going to print "Finished doing operation X for [object Object]". So, if I see code like that in a codebase I maintain, I'm going to spend the time to double-check that the User class really has a toString method, and that this isn't a coercion bug. And this is because the default-inherited Object.prototype.toString() method is garbage for most scenarios, yet it gets accidentally used all of the time anyways.

Same thing will happen if I ever see a userland Error class being used in pattern-matching. I'm going to be left wondering if that userland class actually implemented a custom matcher protocol, or if the developer just slipped up.

ljharb commented 2 years ago

I don't think it would be a good idea to do observable lookups in the builtin matchers, so I still think https://github.com/tc39/proposal-pattern-matching/issues/265#issuecomment-1178612254 would be the only way that could work. We'd also have to implement that - along with a new slot - for every kind of builtin.

That seems complex enough that it might obstruct the entire proposal :-/

treybrisbane commented 2 years ago

class MyError extends Error {}
class AnotherError extends Error {}

match (new MyError) {
    when (${AnotherError}): can this match?
    when (${MyError}): can this match?
    when (${Error}): can this match?
}

if we can support this, it will be good.

FWIW, as a rando developer, I feel that error type matching is one of the primary use cases for pattern matching.

As an example, I frequently write code at the API layer that converts domain errors to JSON responses, and that code is almost always something like "if the error is of this type, return this response, else if it's that type, return that response, ...". Pattern matching would be a big improvement for me here so long as I can easily differentiate different error types.

If the above example didn't work as-is in the final version of pattern matching, I'd personally be very, very sad.

Jack-Works commented 2 years ago

If the above example didn't work as-is in the final version of pattern matching, I'd personally be very, very sad.

I absolutely agree. Subclassing error is a widespread use case, it is worth the complexity of supporting it.

ljharb commented 2 years ago

I've put up #279, on top of #273, which combined use instanceof semantics whenever the original constructor was not a builtin. I strongly dislike instanceof semantics, but I can't think of a more appropriate choice, sadly.

theScottyJam commented 2 years ago

And, to be clear, is that change going to just make subclasses of Error work, or will it allow subclasses of all built-in types? (From my reading, it looks like the former?)

I don't feel overly strong about other built-in types, since I know they're rarely subclasses and really shouldn't be, except that it would be nice for consistency sake to make them all behave the same when subclassed.

Also, I know that, for example, HTMLDivElement is a browser API and not part of the core of EcmaScript, but I suspect browser APIs will follow whatever pattern we set here with their built-in types. And, I know they actually encorage subclassing HTMLDivElement (and all other HTML element classes), as that's the only way to make custom elements. It would be nice if pattern-matching against one of those userland subclasses just worked as well.

ljharb commented 2 years ago

@theScottyJam #273 only makes builtin error subclasses work. #279 would make userland subclasses of any builtin constructor (except the primitives) work.

bathos commented 2 years ago

I suspect browser APIs will follow whatever pattern we set here with their built-in types. [...] It would be nice if pattern-matching against [userland subclasses of HTMLElement] just worked as well.

Yep, Web IDL (at internally create a new object implementing interface) deliberately replicates the same observable behaviors as OrdinaryCreateFromConstructor, just using different spec terminology (because they don’t share the [[Intrinsics]] & %Name% syntax pattern). It seems likely that the [[NewTarget]] slot logic could/would end up incorporated similarly because it was expressly stated in the past that alignment was the intention.

HTML uses a special construct-algorithm override in place of the usual Web IDL steps via the [HTMLConstructor] extended attribute but delegates to the same at step 7.1. Because these constructors-when-used-as-super already have unique handling, there might be a good opportunity there to address the custom element scenario in particular with “automatic” subclass matching (esp. given the agent has privileged knowledge of whether instances are valid; there is effectively a kind of built-in branding for these subclasses). The only other commonly subclassed platform interface constructor I can think of is Event, but I don’t personally think implementing branded matchers is a big burden for that scenario — there’s already a need for special care along those lines when doing subclassing it.

ljharb commented 2 years ago

@bathos i got some feedback that the slot approach won't likely be palatable to implementations, so i'm working on an alternative approach that walks the prototype, supports intrinsics from any realm, and gets .constructor to do so. Both approaches should theoretically Just Work with web subclasses as well - HTML would only have to do extra work if it wanted things to work cross-realm for web builtins.

tabatkins commented 2 years ago

Yup, the #279 approach will automatically work for WebIDL objects, in fact, and userland subclasses of HTMLElement, since it proto-walks - so long as you don't want to do anything fancy, when (${MyNewElement}): ... will Just Work.

It doesn't even require any new special-cases unless we specifically want cross-window versions of WebIDL-defined classes to also match (which we might, I dunno, but if we do it's a pretty easy tweak to WebIDL to duplicate what #279 is doing for all WebIDL-defined classes as well).

tc39 / proposal-pattern-matching

Error subclasses: how are they checked? #265