A Pragmatic, Not-Really-Typed Errors Proposal

ethanresnick commented 5 months ago

✅ Viability Checklist

[X] This wouldn't be a breaking change in existing TypeScript/JavaScript code
[X] This wouldn't change the runtime behavior of existing JavaScript code
[X] This could be implemented without emitting different JS based on the types of the expressions
[X] This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
[X] This isn't a request to add a new utility type: https://github.com/microsoft/TypeScript/wiki/No-New-Utility-Types
[X] This feature would agree with the rest of our Design Goals: https://github.com/Microsoft/TypeScript/wiki/TypeScript-Design-Goals

⭐ Suggestion

I know that typed errors have been discussed extensively and rejected — for very good reasons. I don't disagree with any of those reasons. After reading issue #13219 in its entirety, though, I believe I have a proposal that would improve TypeScript’s error handling without introducing any of the drawbacks @RyanCavanaugh outlined when closing that issue.

Specifically, this proposal does not change any of the following core properties of TS’ error model:

The assignability of two function types is unaffected by what errors they might throw. Accordingly, library authors can start throwing a new exception from a function and that is not a breaking change. Also, the errors that a function can throw will never prevent it from being passed to another function.
Soundness is preserved. TypeScript will not misleadingly indicate that the value reaching a catch block is narrower than unknown; instead, the code in a catch block is forced to assume the value could be anything.
Library authors do not need to manually document their functions’ exceptions, or know what exceptions might be thrown by the functions they call.
TypeScript will not force the caller to handle certain types of exceptions. (I.e., no checked exceptions.)

Motivation & Alternatives

To use @RyanCavanaugh’s terminology, there are certain “unavoidable” errors that represent an operation’s rare-but-anticipatable failure cases.

Throughout the ecosystem today — including in the standard library, web APIs, and most third-party libraries — these unavoidable errors are almost always delivered to callers as exceptions (i.e., with throw). Even if I want to use a Result type or union return types in my own code, I’m still going to have to interoperate with lots of external code that throws exceptions.

If my code cares about handling any of these ‘something went wrong’ failure cases, I need to know what exception is thrown in each case (even if only to lift the exception into an Err Result).

As the TypeScript team pointed out, these exceptions often aren’t laid out in a rich class hierarchy, but there is usually some fairly-stable way to identify them (e.g., by their code or name property). Therefore, the primary barrier to handling these exceptions well is that they’re often undocumented, and the documentation that does exist often isn’t exposed in a convenient way (e.g., in an IDE popup).
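To make that concrete, here is a minimal sketch of what identifying an exception by its code or name looks like in today's TypeScript. The helper errorTag, the readSettings stand-in, and the "ENOENT" code are all assumptions chosen for illustration; they aren't taken from any particular library.

```typescript
// Hypothetical helper: extracts a stable identifier from an unknown thrown value,
// preferring a string `code` property and falling back to `name`.
function errorTag(e: unknown): string | undefined {
  if (typeof e !== "object" || e === null) return undefined;
  const candidate = e as { code?: unknown; name?: unknown };
  if (typeof candidate.code === "string") return candidate.code;
  if (typeof candidate.name === "string") return candidate.name;
  return undefined;
}

// Hypothetical stand-in for a library call that throws an error carrying a `code`.
function readSettings(): string {
  throw Object.assign(new Error("no such file"), { code: "ENOENT" });
}

function loadSettingsOrDefault(): string {
  try {
    return readSettings();
  } catch (e) {
    // Only the anticipated failure is handled; everything else is re-thrown.
    if (errorTag(e) === "ENOENT") return "{}";
    throw e;
  }
}
```

Nothing in the type of readSettings tells the caller that "ENOENT" is the thing to look for, which is exactly the documentation gap described above.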

The primary goal of this proposal, then, is to:

use TS’ inference abilities to dramatically level up the documentation of exceptions across the whole ecosystem. Instead of relying on manually-written @throws annotations — which are often missing, outdated, or incomplete — TS can infer a lot of information about a function’s potential errors and serialize that information out to declaration files so it can be shared across package boundaries; then
make the relevant potential errors more visible to developers (e.g., in IDE popups), both when they call a function and when they're handling an exception in a catch block.

The list of errors the developer will be reminded of cannot be exhaustive, for very practical reasons that @RyanCavanaugh has mentioned. But reminding them about many of the potential errors is possible, and should make for better, more-reliable code than the status quo. Various other commenters made this argument as well, often with good examples.

Why not union or `Result` types?

The TypeScript team’s response in #13219 suggested that, instead of TS trying to expose what exceptions might reach a catch block, code should communicate its errors with union return types or Result types, which naturally preserve information about error cases in the type system. But, empirically, these alternatives haven’t taken off, and there are good reasons why:

The standard library can’t be changed, so it’s stuck throwing exceptions.
Third-party libraries could switch to union return types, but doing so would force their users to type test the return value after every operation. Given how often JS code doesn’t care about handling the error cases, library authors are unlikely to want to force this inconvenience onto their users.
Third-party libraries could switch to returning Results, but that would be impractical because JS doesn't have a standard Result type: different libraries would use slightly different Result types; every library would have to explain its Result type to its users; and these various Result types still wouldn’t interoperate well (e.g., in combinators like Result.all).
The code that I write in my application could use a Result type, at least, but even that might not be worthwhile. Because third-party code throws pervasively for ‘unavoidable’ errors, I’d have to adapt all the external code I rely on to return a Result instead. That is quite hard/cumbersome today, given the lack of exception documentation, which makes it difficult to identify all the exceptions that should be, and can safely be, converted to Err results. This proposal would address that missing documentation. But, even if I'm willing — and, with this proposal, more able — to adapt all the external code I work with to Result, using Result has downsides:
- it imposes a readability tax from nested callbacks/“callback hell” (because JS lacks something like do notation);
- it introduces complexity, as there are now more “tracks” for errors to go down (i.e., thrown errors and promise rejections get supplemented by Err results and Promises that resolve to Err results);
- it makes it harder to onboard new team members, who must learn the Result API and learn which errors should be delivered on which tracks.
For these reasons, which I think explain why Result’s adoption in JS has been fairly limited, it would be very compelling to skip all the adaptation of third-party code and instead have my own application code throw as well — if the primary downside of doing so (i.e., that the thrown exception types become completely invisible to callers) could be addressed. This proposal tackles that, so hopefully it creates a method of error handling within an application that's better than what can be achieved with Result today.
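For reference, this is roughly what adapting a single throwing call to a Result looks like today. It's only a sketch: the Result shape and the parseJson wrapper are hypothetical, since JS has no standard Result type.

```typescript
// A minimal, hypothetical Result type; every library would define its own variant.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Lifts only the anticipated SyntaxError into an Err result.
// Anything else keeps propagating as a normal exception.
function parseJson(text: string): Result<unknown, SyntaxError> {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (e) {
    if (e instanceof SyntaxError) return { ok: false, error: e };
    throw e;
  }
}

const good = parseJson('{"a":1}');
const bad = parseJson("{not json");
```

Writing this wrapper requires already knowing that JSON.parse throws a SyntaxError; for most third-party functions, that knowledge is the part that's missing.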

The Proposal

Imagine a type called UnknownError. UnknownError is like unknown, except that it can be put in a union with other types and that union won't be reduced. So UnknownError, UnknownError | SyntaxError, UnknownError | TypeError and unknown are all distinct types, but are mutually assignable to each other, because they all contain a top type.
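For contrast, this small sketch shows how today's unknown behaves in a union: the other members are absorbed, which is the information loss UnknownError is meant to avoid. The AcceptsAnything helper is a hypothetical name used only for this check.

```typescript
// Hypothetical helper type: true when T is a top type (anything is assignable to it).
type AcceptsAnything<T> = unknown extends T ? true : false;

// Today, a union containing `unknown` collapses to plain `unknown`,
// so the SyntaxError and TypeError members are lost.
type Reduced = unknown | SyntaxError | TypeError;

// These assignments only compile if the conditional types evaluate as annotated.
const reducedIsTop: AcceptsAnything<Reduced> = true;
const plainUnionIsTop: AcceptsAnything<SyntaxError | TypeError> = false;
```

UnknownError would keep the top-type assignability that the first check demonstrates, while still displaying the extra members.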

Now, imagine that every function type has an associated union type of the error types that it can throw. Let's call this the ErrType of the function. This type always contains UnknownError, and UnknownError can't be removed from it; TS adds it implicitly and unconditionally. In other words, every function is always assumed to be able to throw anything, which is why a function's ErrType doesn't end up affecting its assignability.[^1]

Any types added to a function's ErrType union besides UnknownError represent an incomplete set of specific exception types that might be expected when calling that function. This would ideally include most of its "unavoidable" exception types, in @RyanCavanaugh's terminology, and perhaps a few others.

I'll describe in more detail later how this ErrType would be determined, but, at a high level, it would use a combination of information from declaration files and inference powered by control flow analysis.

Of course, neither CFA nor the declaration files would be perfect or complete — but they wouldn't need to be! For example, when the declaration file for a library (probably one not written in TS) doesn't list some of its exceptions, those exceptions won't be able to show up in the inferred ErrType of functions that call the library. But that doesn't compromise soundness, because UnknownError is still part of the ErrType; it just gives the caller slightly fewer hints about what errors might be thrown. On the other extreme, CFA might add to the ErrType an error that, in context, could never occur — but this also doesn't harm anything.

Putting these ideas together, consider a slightly-modified version of @RyanCavanaugh's example from #13219:

// `throws XXX` means the function's `ErrType` is `UnknownError | XXX`
declare function doSomething(): void throws RangeError;
declare function doSomethingElse(it: string): void throws SyntaxError;

const justThrow: () => void = () => {
  throw new TypeError("don't call me yet");
};

function foo(callback: () => void) {
  try {
    callback();
    doSomething();
    doSomethingElse("blah blah");
  } catch (e) {
    // e: ?
    if (e instanceof RangeError) {
      return 0;
    }
    throw e;
  }
}

foo(justThrow);

The type of e above would be UnknownError | RangeError | SyntaxError.[^2] That is, the type of e in a catch block is simply the union of the ErrTypes of any functions called in the try block, plus the types of any errors thrown explicitly in the try block, but excluding errors that CFA determines can’t escape the try block (say because they’re thrown from a nested try block with its own catch).

While still being sound — UnknownError includes the TypeError that is actually thrown at runtime — UnknownError | RangeError | SyntaxError is more useful than unknown or any. It lets foo give special attention to the anticipatable error cases, some of which it may be able to recover from, without TS pretending like those are the only possible errors.
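Here is a version of the same example that compiles and runs in today's TypeScript, to show the runtime behavior the proposed type has to stay sound against. The bodies of doSomething and doSomethingElse are hypothetical stand-ins for the declared functions.

```typescript
// Hypothetical stand-ins for the declared functions in the example above.
function doSomething(): void {}
function doSomethingElse(it: string): void {
  if (it.length === 0) throw new SyntaxError("empty input");
}

const justThrow: () => void = () => {
  throw new TypeError("don't call me yet");
};

function foo(callback: () => void): number | undefined {
  try {
    callback();
    doSomething();
    doSomethingElse("blah blah");
    return undefined;
  } catch (e) {
    // Today `e` is `unknown` (or `any`), so nothing here hints that
    // RangeError and SyntaxError are the anticipated cases.
    if (e instanceof RangeError) {
      return 0;
    }
    throw e;
  }
}
```

Calling foo(justThrow) re-throws the TypeError, which neither declaration mentions; that's the case the implicit UnknownError covers.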

Meanwhile, the IDE popup that a user would see when hovering over foo would show the inferred definition of foo that would be emitted in a declaration file, i.e. something like foo(callback: () => void): void throws SyntaxError. Seeing throws SyntaxError is a potentially useful reminder to callers to, e.g., devise a fallback value for the error case.

A core strength of this proposal is that it can be adopted incrementally:

existing declaration files can be used on day one without compromising soundness;
improvements to those declaration files over time (to list more errors) won't generate type errors in downstream code, but will make this feature more useful;
mistakes in declaration files (e.g., from misunderstandings around exactly what errors to list or how to indicate what errors propagate if thrown by function-valued arguments) have minimal consequences;
declaration file improvements should come quickly, whenever a TS-based library is recompiled and republished, because a huge amount of this error information will be inferred;
errors thrown at the application-level will be fully tracked from day one (as TS will have access to the application's full source), making for a viable Result alternative.

Finally, this all happens while working with the grain of existing, idiomatic JS code, rather than trying to fight against the throwing that's all over the ecosystem.

The Details

Inferring a function’s `ErrType`

Including UnknownError in every function’s ErrType frees us from the impossible task of creating an exhaustive list of each function’s errors; with that freedom, we can instead ask: What potential exceptions would be most useful to show the developer in a function’s ErrType and would promote good exception handling?

I see a few kinds of exceptions that ought to be treated differently:

There’s the set of “something went wrong” errors that the function’s author clearly anticipated. These will usually be errors that the function throws directly (as opposed to errors thrown by a function it calls). These should obviously be included in the function’s ErrType.
On the other extreme are exceptions that the function’s author clearly didn’t plan for. These are exceptions that, if they were to occur, would occur outside a try block. They often include exceptions that arise if the function’s input doesn’t match its contract/TS types, and errors that TS might have known were possible (e.g., JSON.parse producing a SyntaxError) but that the function’s author assumed couldn’t occur in context. The function’s caller can’t safely handle these exceptions, because the program could be in an invalid state, so making them visible in the ErrType would encourage unsafe code. Moreover, if the function’s author assumed that a potential error wouldn’t occur in a given context (e.g., when constructing a RegExp from a known-good literal string), including that error in the ErrType probably just adds counterproductive noise.
Finally, there are cases where the function’s author anticipated that calling some other function might throw an exception; accordingly, the function: 1) wrapped its call to the other function in a try-catch or try-finally, 2) did any cleanup needed to leave the program in a valid state after the exception, but then 3) simply passed the exception through to the caller. In these cases, I think the errors from the called function's ErrType should be included in the main function’s inferred ErrType, because it’s safe for the main function’s callers to catch these exceptions, and doing so might occasionally be useful; as @RyanCavanaugh said, you might do it “once or twice, for example, to issue a retry on certain HTTP error codes”. Moreover, including them reflects the reality that the function is leaking information about its underlying implementation (by passing these exceptions along).

Based on this classification, I’d propose the following concrete rules for ErrType inference:

Any time throw x appears in a function’s body, the type of x is added to the function’s ErrType, unless CFA determines that that exception cannot escape from the function (i.e., it’s caught and handled within the function). An error thrown from an unreachable branch (e.g. the default case of a switch that’s meant to be exhaustive) is considered unable to escape the function. Broadly, this rule covers the first class of exceptions outlined above.
Any time a function is called in the try block of a try-finally, the called function’s ErrType is added to the outer function’s ErrType (excluding those errors that CFA can verify will not escape the function, thanks to an outer catch). This rule covers the third class of exceptions above. Note that this rule applies only to function calls in try-finally statements with no catch; in try-catch or try-catch-finally statements, an exception can only escape the statement if it’s thrown explicitly from the catch or the finally block, so it will fall under the first rule.
The ErrTypes of all other functions called within a function are not added to that function’s ErrType. This covers the second class of exceptions above. There is one special case here: never-returning functions are assumed to be called only for the errors they throw, so they’re treated according to rule 1 (i.e., as though an explicit throw had occurred at the point where the never-returning function was called, and the type of the thrown value was the never-returning function’s ErrType).
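The three rules can be sketched against ordinary code that runs today. The comments describe what I'd expect the proposed inference to produce; the names (NotFoundError, store, lookup, and so on) are hypothetical and exist only for this illustration.

```typescript
class NotFoundError extends Error {}

const store = new Map<string, string>([["a", '{"n":1}']]);
let openHandles = 0;

// Rule 1: the explicit `throw` can escape, so NotFoundError would join lookup's ErrType.
function lookup(key: string): string {
  const raw = store.get(key);
  if (raw === undefined) throw new NotFoundError(`no entry for ${key}`);
  return raw;
}

// Rule 3: both calls happen outside any `try`, so neither JSON.parse's SyntaxError
// nor lookup's NotFoundError would be added; readNumber's ErrType stays UnknownError.
function readNumber(key: string): number {
  const parsed = JSON.parse(lookup(key)) as { n: number };
  return parsed.n;
}

// Rule 2: lookup is called inside a try-finally with no catch, so its
// NotFoundError would be added to lookupTracked's ErrType.
function lookupTracked(key: string): string {
  openHandles++;
  try {
    return lookup(key);
  } finally {
    openHandles--; // cleanup runs whether or not lookup throws
  }
}
```

At runtime, of course, a NotFoundError can still escape readNumber; the rules only decide which errors get advertised to callers.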

In addition, I’d propose the following bit of new syntax: someFn() throws XXX. This syntax makes it more ergonomic to express the rare-ish case where (some of) the errors thrown by someFn ought to contribute to the calling function’s ErrType.

I’ll use this syntax in the examples below for brevity, but it ultimately doesn’t add new capabilities and could be omitted; from the perspective of ErrType inference, it’s simply sugar for try { someFn() } catch(e) { throw e as XXX; }.[^3]
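Since the desugared form is already valid TypeScript, it can be written out today. In this sketch, pickIndex is a hypothetical function used only to have something that throws.

```typescript
// Hypothetical function that can fail with a RangeError.
function pickIndex(n: number): number {
  if (n < 0) throw new RangeError("index must be non-negative");
  return n;
}

// Desugared equivalent of the proposed `pickIndex(n) throws RangeError`.
function pickIndexExplicit(n: number): number {
  try {
    return pickIndex(n);
  } catch (e) {
    // The assertion changes only the static type of the thrown expression;
    // at runtime the original value is re-thrown untouched, whatever it is.
    throw e as RangeError;
  }
}
```

Today the assertion is invisible to callers; under the proposal, it would be what puts RangeError into pickIndexExplicit's inferred ErrType.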

The example below, adapted from the TS homepage, demonstrates these rules:

export function updateUser(id: string, update: Partial<User>) {
  const user = getUser(id)
  const newUser = { ...user, ...update }
  saveUser(id, newUser)
}

function saveUser(id: string, value: User) {    
  localStorage.setItem(`lastmodified.${id}`, new Date().toISOString());
  localStorage.setItem(id, JSON.stringify(value));
}

function getUser(id: string) {
  const user = localStorage.getItem(id);
  return user ? JSON.parse(user) as User : undefined;
}

In terms of ErrType inference:

The TypeError that JSON.stringify can throw, and the SyntaxError that JSON.parse can throw, are not reflected in the ErrType of saveUser and getUser, respectively; the author assumes, fairly reasonably, that these calls will not throw in context, and TS takes their word for it. Accordingly, TS doesn’t clutter up the ErrType with those unlikely errors, which, in the general case, would be unsafe to handle anyway if they did occur.
For the same reasons, saveUser’s ErrType doesn’t include the QuotaExceededError error that setItem can throw.

The author’s assumption that setItem won’t throw a QuotaExceededError is less justified, though, and the possibility of this exception reveals a potential bug: in saveUser, storing the last modified date could succeed, but then storing the corresponding data could fail, if storing the date filled up the website’s storage quota.
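The partial-write failure can be reproduced with a small in-memory stand-in for localStorage. Everything here (TinyStorage, its character-count quota, the QuotaExceeded class, the fixed timestamp) is a hypothetical simplification, not how browsers actually account for storage.

```typescript
// Hypothetical stand-in for the DOMException that setItem throws when storage is full.
class QuotaExceeded extends Error {
  constructor() {
    super("quota exceeded");
    this.name = "QuotaExceededError";
  }
}

// Hypothetical storage whose quota is the total number of characters in keys and values.
class TinyStorage {
  private items = new Map<string, string>();
  private quota: number;
  constructor(quota: number) {
    this.quota = quota;
  }
  setItem(key: string, value: string): void {
    let used = 0;
    this.items.forEach((v, k) => {
      if (k !== key) used += k.length + v.length;
    });
    if (used + key.length + value.length > this.quota) throw new QuotaExceeded();
    this.items.set(key, value);
  }
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
}

// Same two-step write as saveUser above, with a fixed timestamp for repeatability.
function saveUser(storage: TinyStorage, id: string, value: string): void {
  storage.setItem(`lastmodified.${id}`, "2024-01-01T00:00:00.000Z");
  storage.setItem(id, value);
}

const storage = new TinyStorage(60);
let quotaHit = false;
try {
  saveUser(storage, "u1", "x".repeat(40));
} catch (e) {
  quotaHit = e instanceof QuotaExceeded;
}
// The timestamp was written, but the data it describes was not.
const inconsistent =
  storage.getItem("lastmodified.u1") !== null && storage.getItem("u1") === null;
```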

Hopefully, from day one, this proposal would make such a bug less likely, as the built-in declaration for setItem would look something like this:

setItem(key: string, value: string): void throws DOMException<"QuotaExceededError">

An author, seeing that in their IDE popup (and/or possibly aided by some lint rules), might be sufficiently reminded of this failure possibility to rewrite their code to avoid it.

If the code were rewritten like:

export function updateUser(id: string, update: Partial<User>) {
  const user = getUser(id)
  const newUser = { ...user, ...update }
  saveUser(id, newUser) throws UserSaveFailedError
}

function saveUser(id: string, user: User) {    
  try {
    const val = JSON.stringify({ date: new Date(), user });
    localStorage[id] = val;
  } catch(e) {
    // e: UnknownError | TypeError | DOMException<"QuotaExceededError">
    // TypeError comes from JSON.stringify, while the 
    // DOMException shows that ErrTypes apply to setters too
    throw new UserSaveFailedError(id, { cause: e });
  }
}

// ...

Then, ErrType inference would include the explicitly-thrown UserSaveFailedError in saveUser’s ErrType. It would also be included in updateUser’s ErrType, thanks to the use of throws UserSaveFailedError. But, as before, the potential error from JSON.stringify would not be explicitly part of the ErrType.

These inference rules could certainly be made more complicated, which would allow them to do the "right thing" more often on existing, real-world code, at the cost of the rules becoming harder to explain and learn. I think that's likely to be a bad tradeoff, but I'd want to see the results of these rules on much more real world code before saying that confidently.

Annotating a function’s `ErrType`

In this proposal, a function's ErrType is always inferred when its implementation is present; it is not legal to annotate the ErrType of a function in these cases. E.g., the following would not be allowed:

 // both are illegal ErrType annotations
 function x(): void throws XXX { /* body here */ }
 const x = (): void throws XXX => { /* body here */ }

Removing the ability to explicitly annotate a function implementation’s ErrType removes the large maintenance burden that would be required to keep manually-authored throws annotations up-to-date or as complete as what would've been inferred. That drudgery is part of why checked exceptions have failed in other contexts (the inferred error lists can get quite long) so it’s important to avoid it.

Admittedly, this restriction is inconsistent with the rest of TS (where an explicit type annotation can throw away precision relative to what was inferred). However, the fact that any inferred ErrType would be assignable to any explicitly-written ErrType (by the logic of UnknownError) means it’d be easier for these manually-written annotations to silently come out of sync than it would be for other annotations. Moreover, if manually-annotated ErrTypes are prohibited from the beginning, that could always be relaxed later if it proves annoying or counterintuitive; but, of course, the reverse is not true.

Additionally, if there are (rare) cases where it’s deemed critical to see a function’s ErrType directly in a TS source file’s text (i.e., without needing an IDE), a number of escape hatches would be available, based on the rules I propose below for where throws annotations would be allowed. E.g., one could write

const x: () => void throws XXX = function() { /* ... */ }

This would be annotating the type of the x variable, not the function expression. The logic of UnknownError dictates that this assignment should always succeed — although, a la excess property checks, heuristics could be added here to flag this assignment if XXX looks off; see details below.

Type Definition/Declaration Syntax

Declaration files and declare statements obviously need a way to record a function's ErrType, to carry this information across package boundaries.

It seems sensible that the same syntax should be usable in every other context where a type definition is allowed. Therefore, if declare const foo: () => void throws TypeError is valid, I'd expect type Foo = () => void throws TypeError to be valid too.

Because every ErrType always includes UnknownError, a function that annotates a parameter as type Foo above (rather than just () => void) is indicating that it might give special meaning to TypeError errors and be prepared for them to be thrown; it's not indicating that the function it accepts can only throw TypeError.

Similarly, an interface that includes a property of type Foo is advising implementers of the interface to throw a specific error, and consumers of the interface to handle it. But, again, a function doesn’t have to throw this error (or only this error) to satisfy the interface.

Implicit in all the syntax examples given so far is that TS would never emit UnknownError in a throws annotation (or show it in an IDE popup), as it’s implicitly present for every function. If a user manually writes UnknownError in a declaration, it has no effect.

If a function’s ErrType is only UnknownError, then, the function's type would canonically be written exactly as it appears today — i.e., () => void is simply shorthand for () => void throws UnknownError. This preserves backwards compatibility.

Parametric `ErrType`s

Any proposal for typed errors is gonna face the demand for those types to be generic/parametric. User-land versions of map, for example, or the example foo function shown above, propagate errors thrown by their callback. Accordingly, this proposal envisions that normal type parameters can be used in a throws clause.

Here’s the original foo example annotated with a type parameter for the callback’s ErrType:

// NB: the type annotation on `justThrow` above was removed, so its inferred type
// is now `() => void throws TypeError`, as opposed to `() => void`.
const justThrow = () => {
  throw new TypeError("don't call me yet");
}

function foo<E>(callback: () => void throws E) {
  try {
    callback();
    doSomething();
    doSomethingElse("blah blah");
  } catch (e) {
    // e: UnknownError | E | RangeError | SyntaxError
    if(e instanceof RangeError) {
      return 0;
    }
    throw e;
  }
}

foo(justThrow);

Hovering over foo(justThrow); would now show a concrete instantiation of foo's type, like foo(callback: () => void throws TypeError): void throws TypeError | SyntaxError.

The inferred ErrType of foo would be: UnknownError | Exclude<E, RangeError> | SyntaxError.

The rules for this inference are roughly:

When any type parameter is inferred, it would be inferred with UnknownError excluded from the source types used to infer it. Therefore, the fact that justThrow’s ErrType includes UnknownError doesn’t automatically add UnknownError into the inferred type for E. (This would become relevant if E were also used as an argument’s type.) Instead, UnknownError is removed from the types used to infer E, then E is inferred as normal, and then UnknownError is automatically added back into every ErrType at the end of the process.
The Exclude<E, RangeError> is automatically generated by CFA, which observes the types of errors that are not re-thrown.
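The Exclude step can be approximated with today's type system, as in the sketch below. The error classes are hypothetical and each carries a distinct kind literal, because TypeScript compares types structurally and Exclude only removes the intended member when the classes differ in shape.

```typescript
// Hypothetical error classes with distinct shapes.
class TimeoutError extends Error {
  readonly kind = "timeout" as const;
}
class AuthError extends Error {
  readonly kind = "auth" as const;
}

// What can still escape once TimeoutError has been handled.
type Escaping<E> = Exclude<E, TimeoutError>;

// Compile-time check: this assignment only type-checks if AuthError is all that remains.
const onlyAuthRemains: Escaping<TimeoutError | AuthError> extends AuthError ? true : false = true;

// Runtime counterpart: TimeoutError is handled here, everything else is re-thrown.
function runWithRetryHint(callback: () => void): "ok" | "retry" {
  try {
    callback();
    return "ok";
  } catch (e) {
    if (e instanceof TimeoutError) return "retry";
    throw e;
  }
}
```

In the proposal, CFA would derive the Escaping part on its own by noticing which instanceof branches don't re-throw.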

The interaction between generics and the logic of UnknownError can lead to some weird results. For example:

const x: <T>(it: T) => T throws T = function<T>(it: T) {
  throw new RangeError();
}

This assignment is allowed because the inferred ErrType of the function expression would be UnknownError | RangeError, while the ErrType of x is UnknownError | T, and the logic of UnknownError makes these assignable regardless of T’s type. This is slightly weird, in that the error thrown by the function actually has no relation to its argument, but I don’t think it’s a dealbreaker.

Async Error Handling

This proposal is easy to generalize to async error handling: in the same way that a function type has an associated ErrType, a Promise would have an associated ErrType representing the errors it could reject with. As with functions, this type would always implicitly include UnknownError, such that the ErrType of a Promise does not affect its assignability to other Promise types.

When inferring the ErrType of the Promise returned from an async function, the same rules would apply as for synchronous functions, with the additional rule that the ErrType of any returned Promise would be included in the function’s ErrType.
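The reason for that additional rule is visible in how async functions behave today: a throw inside the body and a rejection of a returned promise both surface as a rejection of the outer promise. A minimal sketch, with hypothetical function names:

```typescript
// An explicit `throw` inside an async function becomes a rejection of its promise.
async function failsByThrowing(): Promise<number> {
  throw new RangeError("thrown inside the async function");
}

// Hypothetical helper whose returned promise rejects.
function rejectingHelper(): Promise<number> {
  return Promise.reject(new SyntaxError("rejected by the helper"));
}

// Returning another promise adopts its rejection as well.
async function failsByReturning(): Promise<number> {
  return rejectingHelper();
}

// Reports the `name` of whatever a promise rejects with, or "none" if it fulfills.
async function rejectionName(p: Promise<unknown>): Promise<string> {
  try {
    await p;
    return "none";
  } catch (e) {
    return e instanceof Error ? e.name : "non-error";
  }
}
```

Under the proposal, I'd expect the first function's ErrType to pick up RangeError via the explicit throw, and the second's to pick up whatever rejectingHelper's Promise declares.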

The syntax for where/how to write the Promise's ErrType could be bikeshed extensively. But the discussion above of ErrType type parameters gestures at one way this could look: the Promise type could have a second type parameter that holds its ErrType (excluding UnknownError).

In that case, a version of foo with an identical body, but just marked async, would be declared as:

declare function foo<E>(callback: () => void throws E): 
    Promise<void, Exclude<E, RangeError> | SyntaxError>

Similarly, Promise.prototype.then would be declared as:

interface Promise<T, E> {
  /* ... some overloads omitted ... */
  then<TResult1 = T, TResult2 = never, E1 = never, E2 = never>(
    onfulfilled: (value: T) => TResult1 | PromiseLike<TResult1, E1> throws E1,
    onrejected: (reason: E) => TResult2 | PromiseLike<TResult2, E2> throws E2
  ): Promise<TResult1 | TResult2, E1 | E2>
}

In that declaration, the onfulfilled and onrejected callbacks use E1/E2 both in the ErrType of the PromiseLike and in a throws clause, since the callbacks can return a rejected promise or throw synchronously. Also, note that the reason parameter of onrejected is now typed (soundly, thanks to the inclusion of UnknownError).

However, any type parameter that occurs as the second type parameter in a Promise/PromiseLike would need to be treated in a special way, namely:

UnknownError would need to always be implicitly added to its final type;
for symmetry with the prohibition against writing a throws annotation for a function’s body, users mentioning Promise in a function’s return type annotation would have to leave this parameter out;
the value for the parameter would then have to be inferred using the ErrType inference rules, rather than the rules for normal type parameter inference.

This special casing could be hardcoded in the compiler or — especially if there are user-land versions of PromiseLike that would need to work as well — it might instead be worth introducing some new keyword like rejectswith, as in:

declare function foo<E>(callback: () => void throws E): 
    Promise<void> rejectswith Exclude<E, RangeError> | SyntaxError

Alternatively, these special type parameters could have a special marking, which, for consistency, could also be required on type parameters that are used in a throws clause. For example, perhaps these parameters would need to be prefixed with error, as in:

interface Promise<T, error E> { /* ... */ }

// error is needed on the declaration of E because E is used in a `throws`
// and in the error-marked parameter of the Promise type
declare function foo<error E>(callback: () => void throws E): 
    Promise<void, Exclude<E, RangeError> | SyntaxError>

Exhaustiveness Checking

In languages with a Result type, it can be useful for the compiler to be able to check that the consumer of a result has handled all possible errors (enumerated in the Result’s error type parameter). In this proposal, that would equate to the compiler checking that all the non-UnknownError portions of an ErrType were handled (or re-thrown).

However, the obvious problem with exhaustiveness checking is that it turns the addition of a new error type into a breaking change, which would probably not be a good thing, especially at first: every improvement to a legacy declaration file (to add missing errors) would lead to exhaustiveness checking errors in consumers of the declarations. While that would force the consumer to ask: “should I do something with this error type I'm newly-aware of?”, it would also make library minor version (or @types package) upgrades more involved/time-consuming.

Therefore, I doubt that exhaustiveness checking should ever be on globally or by default.

However, with this proposal, there could be a way for users to opt-in to exhaustiveness checking within individual catch blocks, consistent with TS's existing exhaustiveness checking idioms. One approach might be:

declare function doX(): void throws InvalidUrlError | InvalidResponseTypeError

try {
  doX()
} catch(e) {
  if(e instanceof InvalidUrlError) { ... }
  if(e instanceof InvalidResponseTypeError) { ... }

  // opt-in to exhaustiveness checking by removing `UnknownError`
  // and having TS validate that what's left over is `never`.
  throw (e as ExcludeUnknownError<typeof e> satisfies never)
}

Note that ExcludeUnknownError would be a new, built-in type that just removes UnknownError from a type. This is needed because Exclude<T, UnknownError> would result in never for any type T, which isn't what the user intends. (That would happen for the same reasons that Exclude<T, unknown> always results in never: UnknownError is a top type.)

This is a somewhat clunky way to get exhaustiveness checking that requires some advanced understanding, but that may not be a bad thing if the idea is for people to use this feature only rarely, where they're sure they really want it, in critical parts of a codebase that is especially error conscious.
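For comparison, this is roughly how the same check has to be written today, when the set of known errors is maintained by hand rather than inferred. The error classes reuse the names from the example above, but their kind fields, the KnownError union, and the helper functions are assumptions made for this sketch.

```typescript
class InvalidUrlError extends Error {
  readonly kind = "invalid-url" as const;
}
class InvalidResponseTypeError extends Error {
  readonly kind = "invalid-response-type" as const;
}

// Hand-maintained list of anticipated errors; the proposal would infer this instead.
type KnownError = InvalidUrlError | InvalidResponseTypeError;

function isKnownError(e: unknown): e is KnownError {
  return e instanceof InvalidUrlError || e instanceof InvalidResponseTypeError;
}

function describeFailure(e: unknown): string {
  // Plays the role of the UnknownError part: re-throw what can't be identified.
  if (!isKnownError(e)) throw e;
  switch (e.kind) {
    case "invalid-url":
      return "bad URL";
    case "invalid-response-type":
      return "unexpected response type";
    default:
      // Stops compiling if KnownError gains a member that isn't handled above.
      return e satisfies never;
  }
}

// Why a dedicated ExcludeUnknownError would be needed: excluding a top type
// removes every member, leaving `never`.
const excludeTopIsNever: [Exclude<string | number, unknown>] extends [never] ? true : false = true;
```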

Details of `UnknownError` and changes to function types

UnknownError | unknown should probably reduce to UnknownError.
When applying a type assertion to a type that contains UnknownError, it might be useful to remove UnknownError from the types on both the LHS and RHS of the type assertion before applying TS’s usual “do the types overlap” check to decide whether the cast is allowed. Because the someFn() throws SomeError syntax would be equivalent to try { someFn() } catch(e) { throw e as SomeError; }, this rule would mostly serve to sanity check that SomeError is related to someFn’s ErrType.

I think this rule would likely be helpful, even though it risks a bit of breakage as declaration files are updated. As with the overlap check on casts today, it could be circumvented by casting to unknown first.

I haven’t fully thought about how functions having an ErrType would affect type inference and contextual typing. Some examples:

const x = (() => {
  throw new RangeError();
}) satisfies () => void

const x2 = (() => {
  throw new RangeError();
}) satisfies () => void throws SyntaxError

declare function id<T extends () => void throws SyntaxError>(it: T): T;

const x3 = id(() => {});
const x4 = id<() => void throws RangeError>(() => {
  throw new Error();
})

declare function id2<T extends () => void throws SyntaxError>(
  v: { x5: T }
): { x5: T };

const { x5 = () => { throw new TypeError(); } } = id2({
  x5() { throw new RangeError() }
});

// Here, the non-UnknownError portion of the ErrType in T's
// constraint is a super-type of TypeError + RangeError below.
declare function id3<T extends () => void throws Error>(
  v: { x6: T }
): { x6: T };

const { x6 = () => { throw new TypeError(); } } = id3({
  x6() { throw new RangeError() }
});

declare function id4<T>(
  it: (x: T) => void throws T
): (x: T) => void throws T

// Does `x8` now require a `TypeError` as its first argument?
const x8 = id4(() => { throw new TypeError(); })

I’m somewhat confident that the ErrType of x should be RangeError, as writing x that way would presumably be an alternative to const x: () => void = () => { /* ... */ }, which would throw away the inferred ErrType of the RHS (because the ErrType of () => void, as shorthand for () => void throws UnknownError, is simply UnknownError).
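The difference between `satisfies` and an annotation that this reasoning relies on can be seen today with return types, which serve here only as an analogy for the proposed ErrType. The names `viaSatisfies` and `viaAnnotation` are made up for this sketch.

```typescript
// `satisfies` checks the expression against the target type but keeps the
// expression's own inferred type, so the `number` return type survives.
const viaSatisfies = ((n: number) => n * 2) satisfies (n: number) => unknown;

// An annotation replaces the inferred type with the declared one.
const viaAnnotation: (n: number) => unknown = (n: number) => n * 2;

const kept: number = viaSatisfies(21);

// @ts-expect-error: the annotation widened the return type to `unknown`.
const lost: number = viaAnnotation(21);
```

By the same logic, an inferred ErrType would presumably survive `satisfies` but be discarded by an annotation.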

Beyond that, I don’t know what the right answers are here, both because I don’t understand TypeScript well enough — including to know what would be easiest to implement — and because figuring out the desired behavior would probably require looking at a lot of real-world code.

TL;DR

JS/TS code does, and will continue to, throw lots of exceptions, including ones that can be usefully caught and recovered from in code that wants to be resilient.
TS can help developers better identify these errors, esp. the "something went wrong" sort, without introducing unsoundness and without the whole ecosystem needing to document every exception first.
TS, through its type inference abilities and its market share, is in a unique position to make exception types better documented, in an automated way, and make this information available to developers in the IDE.
Robust, automatic tracking of thrown exception types might create a new "best option" for application-level error handling. It would allow normal, thrown exceptions to have many of the benefits of Result, without devs having to take on the Result's many downsides (i.e., callback hell, extra error tracks, and needing to adapt all third-party code into Result-returning code).

Footnotes

[^1]: This proposal assumes that a function's potential exceptions are not known exhaustively, so it includes UnknownError in every function's ErrType. However, some commenters in #13219 wanted to be able to assert that a function would only throw particular exceptions (often, in order to require that a function passed as an argument would throw no exceptions). I think the use cases for this, and the circumstances in which a function author can actually know the full set of the function's exceptions, are somewhat limited. However, if there are compelling use cases for this down the road, additional syntax could be added to create function types whose ErrType does not automatically include UnknownError. These new function types would be assignable to all previously-existing function types (which would include UnknownError), and in that sense be backwards compatible.

[^2]: Today, almost all the built-in Error types have structurally-identical definitions. For this proposal to be useful for standard library functions, common errors like TypeError and SyntaxError would have to be made structurally distinct, possibly through the addition of some brand symbol. All the examples in this post assume that these errors have been given distinct TS types.

[^3]: This syntax could presumably be used on getters too (i.e. obj.someProp throws XXX), but setters would have to use the longer, unsugared version.
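To make footnote 2 concrete: the sketch below shows that today's built-in error types are interchangeable at the type level, and one possible way (a literal `kind` field, used here in place of the brand symbol the footnote mentions) to make custom errors structurally distinct. The class bodies are assumptions for illustration; only the class names come from the examples in this post.

```typescript
// Compiles today: the lib declarations give TypeError and RangeError the
// same members, so the checker cannot tell them apart.
const mislabeled: TypeError = new RangeError("out of range");

// The runtime still distinguishes them.
const isTypeErrorAtRuntime = mislabeled instanceof TypeError;

// A literal-typed field is enough to separate two error classes structurally.
class InvalidUrlError extends Error {
  readonly kind = "InvalidUrlError" as const;
}
class InvalidResponseTypeError extends Error {
  readonly kind = "InvalidResponseTypeError" as const;
}

// @ts-expect-error: the `kind` literals differ, so the assignment is rejected.
const rejected: InvalidUrlError = new InvalidResponseTypeError("bad response type");
```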

🔍 Search Terms

error types, typed catch, error documentation

fatcerberus commented 5 months ago

Today, almost all the built-in Error types have structurally-identical definitions.

Outside of custom error subclasses (which users can make structurally distinct themselves if they need to), is there any real use case for distinguishing between, say, TypeError and ReferenceError? Unlike, say, C#, the built-in error classes in JS are very general and often fall at awkward boundaries. In other words you probably wouldn't ever write in JS

try {
    // do a thing
}
catch (e) {
    if (e instanceof TypeError) {
        console.log("invalid data was received by some function");
        // error handled, recover
    } else {
        throw e;  // not a type error, rethrow
    }
}

because you might get a RangeError instead of a TypeError due to the exact same cause (bad data passed to function, e.g.). So IMO it doesn't really matter that they're not structurally distinct.
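As a small illustration of that point, the sketch below feeds bad input to three built-ins and records which error class each one throws. `failureName` is a hypothetical helper written for this example.

```typescript
// Runs `fn` and reports the name of whatever it throws.
function failureName(fn: () => unknown): string {
  try {
    fn();
    return "none";
  } catch (e) {
    return e instanceof Error ? e.name : typeof e;
  }
}

// Three flavors of "bad data passed to a function".
const names = [
  failureName(() => new Array(-1)), // invalid length
  failureName(() => JSON.parse("{bad")), // malformed JSON
  failureName(() => (null as unknown as { length: number }).length), // null input
];
// names is ["RangeError", "SyntaxError", "TypeError"]
```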

DanielRosenwasser commented 5 months ago

Without a fully thought-out response, the way I've often likened something like this is to #26277 ("open-ended union types"), where there is some partially-known set of constituents that you want to handle (for some definition of what "handle" is).

fatcerberus commented 5 months ago

Open-ended unions sounds like what people are often shooting for when they try to write stuff like "foo" | "bar" | string (if not to aid with completions).
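For reference, here is a sketch of the workaround people commonly reach for today. It is an existing community idiom, not part of this proposal, and the `Color` and `paint` names are made up.

```typescript
// Written plainly, the literals are absorbed: this is just `string`.
type Collapsed = "red" | "blue" | string;

// Intersecting with `{}` keeps the union from reducing, so editors can
// typically still suggest "red" and "blue" while any string is accepted.
type Color = "red" | "blue" | (string & {});

function paint(color: Color): string {
  return `painting ${color}`;
}

const known = paint("red");
const other = paint("chartreuse");
```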

ethanresnick commented 5 months ago

@fatcerberus My instinct is that there are at least some cases where it’d be useful to distinguish between the different built-in error types, but I don’t think that making them structurally distinct is a requirement or blocker for this proposal. If this proposal gets any traction, I imagine there’ll be a lot of testing on real-world code before anything lands, and that real world testing should make it clearer whether making the built-in errors distinct is actually worth it.

ethanresnick commented 5 months ago

@DanielRosenwasser I hadn’t seen that issue, but this proposal would absolutely leverage open-ended union machinery if it were to exist! Obviously, that machinery alone isn’t enough to cover all the functionality here (e.g., for ErrType inference), but it's very complementary.

HolgerJeromin commented 5 months ago

I really like this proposal.

declaration file improvements should come quickly, whenever a TS-based library is recompiled and republished, because a huge amount of this error information will be inferred;

As far as I understand, you want to put the inferred types automatically into generated .d.ts files (like declare function doSomething(): void throws RangeError;). This is new syntax, so old TypeScript compilers will not be able to parse these files. Not every consumer of .d.ts files is quick to upgrade, so IMO we should be able to disable this emit. Our customers are often consuming our .d.ts files using TypeScript 3.9.x :-( We guarantee this compatibility with a CI build step. As long as we do not use new syntax (like declare function fancyGetter(name: `get${string}`): number;, new in TS 4.1) in an API, this is possible.

fatcerberus commented 5 months ago

IIRC from what maintainers have said, backward compatibility for .d.ts emit is not guaranteed in general; you're expected to have a downleveling step in your toolchain if you need your declaration files to work with older TS versions than the one you're using.

ethanresnick commented 5 months ago

@DanielRosenwasser If this proposal seems promising, what would the next step be here? Is it the type of thing where the TS team would want to see more community input before anything else? Or is the (long) previous discussion in #13219 already a signal of sufficient community demand? Are there specific issues with the proposal that I could maybe help to address? Or is it more a matter of the TS team talking internally first to figure out how/whether this would cohere with other features TS might add (like open-ended unions), how valuable error typing would be, how hard it'd be to implement, etc?

DanielRosenwasser commented 5 months ago

I think we'd have to allocate some time among the team to get a sense of everything you just listed (e.g. difficulty of implementation, feel, future-compatibility against other possible language features, etc.). Part of it is just a timing/availability thing.

ethanresnick commented 5 months ago

Got it; makes total sense.

Whenever you and the team do have time to talk about it, I’m excited to hear what the outcome is :)

RyanCavanaugh commented 4 months ago

What we've discovered trying (multiple times!) to implement non-workarounded completion lists for open-ended unions is that the instant something doesn't have type system effects, it tends to disappear nearly immediately, or cause huge problems.

Example: having two types () => throws A (call it TA) and () => throws B (call it TB) be mutual subtypes seems fine, but it isn't. It means, for example, that given const x = e ? TA : TB, x has to be one of those types, but can't be a union, so it means TA or TB would get randomly picked. Once it gets randomly picked, then it's a huge hazard because people will inevitably try to fish out the throws type with something like type GetErr<F> = F extends (() => unknown throws infer E) ? E : never, so then type K = GetErr<typeof x> randomly gets you A or B and causes different errors to appear or not appear.

Putting in place type system features which never affect assignability is thus very dangerous, because it means that the behavior of your program becomes chaotic, or you're not allowed to use that feature in any way where it's observable, which becomes its own can of worms. Like you might just say "Oh, well, it's just illegal to write throws infer E", but that doesn't solve the problem, because you can do a trivial indirection:

type Throws<T> = () => unknown throws T;
type TA = Throws<A>;

type Unthrows<T> = T extends Throws<infer E> ? E : never;

where now you have a situation where you can't do nominal inference of Unthrows<TA> because it would cause the illegal observation of the throws type. So now you have to have a separate system to track "type parameters where it's legal to observe them" and "type parameters where it's not legal to observe them" and come up with sensible error behavior any time someone tries to cross that invisible line. You can keep stacking on more ad-hoc rules to try to ban this, but you'll always either end up at some extremely inconsistent (and defeatable) set of rules, or prevent the feature from being used in the way it was originally intended in the first place.
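The indirection can be tried in today's TypeScript by using the return type as a stand-in for the proposed throws clause. This is only an analogy; `Returns`, `Unreturns` and `RA` are hypothetical names for this sketch.

```typescript
// Stand-in for `Throws<T>`: the parameter sits in the return position.
type Returns<T> = () => T;

// No `infer` appears next to the return-type syntax itself. The alias
// indirection is enough to observe whatever sits in that position.
type Unreturns<F> = F extends Returns<infer R> ? R : never;

type RA = Returns<"A">;
type Observed = Unreturns<RA>; // "A"

const observed: Observed = "A";
```

Banning a specific syntactic form would therefore not stop the type parameter from being observed.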

Ultimately this doesn't sound like a type system feature, for basically the same reason that type system features to create documentation descriptions don't exist. At the end of the day you can't add things to the type system that don't do anything because people will still find ways to observe the thing that's happening, and without some well-reasoned ordering of subtypes, that will cause a lot of unpredictable behavior.

ethanresnick commented 4 months ago

@RyanCavanaugh I don’t know enough type theory to fully engage the issues here, but let me take a stab at a constructive response — and forgive me if I miss things that ought to be obvious.

What I take you to be saying, essentially, is: because so much of TS’ underlying logic relies on types being in a hierarchy/partial order, the concept of "mutual subtypes that are nevertheless distinct" is somewhere between “very tricky to implement” and “conceptually incoherent”. Since open-ended unions try to make mutual subtypes out of "blue" | string and string, they’re gonna run into lots of problems. The same thing applies to trying to put unknown/UnknownError into a union that doesn’t reduce, and to building mutually-assignable function types containing that union.

Do I have all that right?

If so, I guess I see three directions for trying to advance the DX goals of this proposal (which are very valuable imo):

Stick with the prior issue's conclusion, that this should be implemented outside the type system. Thinking open-mindedly, maybe "outside the type system" wouldn't have to mean "outside of TypeScript", given that inclusion in TS might be critical for adoption.

However, beyond adoption, I think the fundamental issue is that some type system involvement probably is necessary to make this all work. As much as the UnknownError/open-ended union portion of this proposal is a weird fit for the type system, the work to figure out the non-UnknownError constituents of the ErrType requires very standard type system machinery, especially for generic error types. Consider promiseThatCanRejectWithX.then(functionThatCanThrowY). Any analysis should conclude that that's a promise whose known rejection types include X and Y. Similarly, on your example of const x = e ? TA : TB, the analysis has to conclude that the known error types of x include A and B. That all feels very type system-y.
Try again to somehow solve the open-ended union problem in a way that's consistent/coherent/predictable. I accept that this is hard or maybe impossible (but seems very interesting, so I might try to help). At the very least, it probably requires devising a bunch of new typing/type inference rules, even if @DanielRosenwasser's recent idea here is in the right direction.
Try to come up with a simpler and more targeted (albeit less complete) solution that would suffice for this proposal. I'm inclined towards this approach.

Here’s a sketch of one idea I’ve been noodling on in that direction, which I’m hoping you can sanity check:

As in the OP, every function type (and promise type) would have an associated type for its errors. However, rather than trying to make this type represent every possible error (with an open-ended union), it would represent just the known errors. Let’s call it the KnownErrType. In () => void throws A, A would now be the KnownErrType.  Open-ended unions are totally gone.
The error variable that reaches a catch block would continue to be typed as unknown. But, these catch block variables would have a special tag that affects how their type is shown in IDE popups. Specifically, the IDE popup would show both the variable's actual type, and its known error types (derived, as per the OP, from CFA on the corresponding try block). Maybe this is with some sort of newly-formatted popup; maybe it's a hack where the language server reports that the type of the variable is unknown | KnownError1 | ... | KnownErrorN, even though that's obviously equivalent to unknown and the "real" type would simply be unknown. If the type of the catch block variable narrows, the same narrowings would apply to the known error types. So, in other words, catch block variables would have a type at each location and a set of known error cases at each location, and the IDE popup would show both.
Since we still want to keep function types assignable to each other regardless of errors, the KnownErrType of a function/promise type would be ignored in almost all circumstances; i.e., types with different KnownErrTypes wouldn’t be distinct for the purposes of almost all compiler algorithms. The idea, basically, is to lean into the idea of the KnownErrType being a kind of documentation/metadata, and keep it out of most type-system logic. As mentioned above, though, it can't be kept out of all type system logic (e.g., it needs to be elicitable in conditional types with throws infer E; when TS infers specially marked error type parameters per the OP, for parametric error cases; etc.). But the idea would be that there are specific cases where a function's known errors need to be elicited and, in these cases, the KnownErrType is used and behaves just like a regular type, as it no longer has the weird UnknownError in it.
The only real requirement, then, is that TS preserve/update a function's KnownErrType as needed, so that it can serve its ultimate documentation purpose in a catch block. To that end, when two function types (or two promise types) “combine” to produce a new type, their KnownErrTypes combine as well to produce the KnownErrType of the new type. Specifically:
- When subtype reducing two function/promise types in a union, the resulting type’s KnownErrType would be the union of the two original types’ KnownErrTypes.
- When unifying two function or promise types into a new function or promise type, the resulting type’s KnownErrType would be the union of the two original types’ KnownErrTypes.
- Maybe there are other cases too? Again, I don’t know enough type theory or TS compiler details to know all the ways that the KnownErrType information might get lost. But I hope the basic idea here is simpler, in that two function types which differ only in their known errors would no longer have to be preserved indefinitely as distinct types that are mutual subtypes; instead, they can immediately be collapsed into one type (but preserving the original types' known error metadata).

  So, in your example of const x = e ? TA : TB, the unification of () => void throws A and () => void throws B would always be () => void throws A | B.
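The intended result (known errors `A | B` for the conditional) can be approximated in today's TypeScript with a phantom property. This is not the proposed mechanism: unlike the proposal, the two function types below are not mutually assignable, so an ordinary union forms instead of a merged type. `Throws`, `KnownErr`, `withKnownErr`, `ErrA` and `ErrB` are hypothetical names for this sketch.

```typescript
class ErrA extends Error { readonly code = "A" as const; }
class ErrB extends Error { readonly code = "B" as const; }

// Hypothetical stand-in for `F throws E`: E lives only in a phantom property type.
type Throws<F, E> = F & { readonly __knownErr: E };
type KnownErr<F> = F extends { readonly __knownErr: infer E } ? E : never;

// Tags a function with its known error type. The cast changes nothing at runtime.
function withKnownErr<E>() {
  return <F extends () => unknown>(fn: F): Throws<F, E> => fn as Throws<F, E>;
}

const throwsA = withKnownErr<ErrA>()(() => { throw new ErrA("a failed"); });
const throwsB = withKnownErr<ErrB>()(() => { throw new ErrB("b failed"); });

// Analogue of `const x = e ? TA : TB`.
function pick(useA: boolean) {
  return useA ? throwsA : throwsB;
}

// The conditional type distributes over the union: ErrA | ErrB.
type Picked = KnownErr<ReturnType<typeof pick>>;
const knownCodes: Array<Picked["code"]> = ["A", "B"];

// Reads the `code` of whatever `fn` throws, without relying on instanceof.
function caughtCode(fn: () => unknown): string {
  try {
    fn();
    return "none";
  } catch (e) {
    const code = (e as { code?: unknown } | null)?.code;
    return typeof code === "string" ? code : "unknown";
  }
}
```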

ethanresnick commented 4 months ago

The solution above — treating the catch block variable specially — isn't really a complete solution. E.g., it doesn't support communicating the known error types to a promise.catch callback. But it probably gives the vast majority of the real world DX value, and it doesn't block a more comprehensive open-ended union solution in the future, which would change the type of the catch block variable from unknown to an open-ended union containing/bound by unknown.

It also probably would make it harder for code to opt-in, on a case-by-case basis, to a TS check that all the known errors had been handled. (Though a compiler flag or a lint rule could enable that check globally for codebases that really want it.)

RyanCavanaugh commented 4 months ago

To that end, when two function types (or two promise types) “combine” to produce a new type

It's notable that there is no existing "combine" mechanism where two types get synthesized into a new type. For example, let's say you have

declare function choose<T>(a: T, b: T): T;

if we call choose(throwsA, throwsB), generic inference needs to produce a single output type T from the presented candidates (throwsA and throwsB). Synthesizing a new type from the two of them has never been done before, because there's never been a correct reason to do it before. This invariant has been useful and correct for the entire lifetime of TypeScript so changing it now should have some really strong justification.
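A sketch of that behavior as it can be observed today, using hypothetical `Animal` and `Dog` classes. The body of `choose` is irrelevant here; only its signature drives inference.

```typescript
class Animal { name = "animal"; }
class Dog extends Animal { bark(): string { return "woof"; } }

function choose<T>(a: T, b: T): T {
  return a ?? b;
}

// Candidates for T are Dog and Animal. Inference selects one of them
// (Animal, which covers both) rather than building a new type.
const picked = choose(new Dog(), new Animal());
const asAnimal: Animal = picked;

// @ts-expect-error: `bark` is not on Animal, the single inferred T.
picked.bark;

// @ts-expect-error: neither candidate covers both arguments, so the call is rejected.
choose("a", 1);
```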

Obviously nothing is impossible and invariants have been removed in the past, but breaking this invariant only for the sake of making catch variables get better IDE pop-ups is not really compelling from a trade-offs perspective.

I'm still not clear on what's being gained from doing this in type space, where it has no type effects and will misbehave in all sorts of ways, instead of doing this in the JS Doc space? If the feature isn't going to work well under indirection anyway, then it just seems obviously fine to implement this through walking up to JS Doc declarations the same way we do to provide documentation tooltips.

ethanresnick commented 4 months ago

I'm still not clear on what's being gained from doing this in type space, where it has no type effects and will misbehave in all sorts of ways, instead of doing this in the JS Doc space? If the feature isn't going to work well under indirection anyway, then it just seems obviously fine to implement this through walking up to JS Doc declarations the same way we do to provide documentation tooltips.

I read your question in a couple different ways.

The first read, which probably isn't what you mean, is: "Could this be done using the JSDoc annotations that code actually has today?" I think the answer there is pretty clearly no: the @throws annotations on existing code are woefully incomplete and/or out of date, and probably always will be if developers are being asked to maintain these annotations manually.

So then the second read is: "Could some analysis tool automatically identify a function's anticipatable errors and serialize that set of errors to JSDoc? Then, that JSDoc could be consumed by the TS language server to power IDE popups."

I'm assuming that the analysis to identify each function's anticipatable error types needs to happen on TS source code, rather than on the published, possibly-minified JS code: by the time the code is converted to JS, it seems like too much type information is lost.[^1] So, if the analysis is happening on the raw TS source, there needs to be some way to serialize the results of the analysis and publish it with the compiled code, and the question is just: is JSDoc adequate for that?

My first thought is that, if TypeScript is doing this analysis (which I think it should be for reasons discussed below), then it seems a little weird for TS to modify a function's JSDoc on emit. There could also be conflicts if the function already has @throws annotations.

The bigger limitation of JSDoc is that it doesn't support any parametricity. E.g., in:

function f(arr: unknown[]) {
  try {
    return arr.map(xxxx);
  } catch (e) { /* … */ }
}

Clearly, the known error types for e should be the same as the known errors for xxxx, but there's no JSDoc that one can write for map to make that happen.
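To make the gap concrete, here is a sketch with two hypothetical helpers, `sqrtStrict` and `mapAll`. A fixed error can be documented with `@throws`, but for a higher-order function the annotation can only describe the relationship in prose.

```typescript
/**
 * Hypothetical helper, for illustration only.
 * @throws {RangeError} if `n` is negative
 */
function sqrtStrict(n: number): number {
  if (n < 0) throw new RangeError("negative input");
  return Math.sqrt(n);
}

/**
 * Hypothetical helper, for illustration only.
 * @throws Whatever `fn` throws. JSDoc cannot tie this to `fn`'s own annotations.
 */
function mapAll<T, U>(items: readonly T[], fn: (item: T) => U): U[] {
  return items.map(fn);
}

// Reports the name of whatever `run` throws.
function caughtName(run: () => unknown): string {
  try {
    run();
    return "none";
  } catch (e) {
    return e instanceof Error ? e.name : typeof e;
  }
}

// The RangeError from `sqrtStrict` propagates through `mapAll` at runtime,
// but nothing in `mapAll`'s JSDoc lets a tool derive that.
const propagated = caughtName(() => mapAll([4, -1], sqrtStrict));
```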

But maybe that's not a show-stopper. Maybe a workaround would be to say: any time f calls a function and passes it a function as an argument, assume that function argument is gonna get called and its thrown errors are gonna propagate. So e would end up with xxxx's error types as possibilities, even though TS knows nothing about map's handling of errors or whether it even calls xxxx. Something analogous would happen when a Promise is passed as an argument. (I.e., TS would assume that the promise is awaited in the function it's passed to, and that its rejection propagates.)

There are lots of cases where that heuristic won't work right, but it's a conservative and maybe-not-horrible assumption?

Still, part of what I was going for with my original syntax was that it would be possible to be a bit more precise about things like this. E.g., to do:

interface PromiseRejectedResult<E> {
  status: "rejected";
  reason: WithKnownCases<E, unknown>; // open-ended union E | unknown
}

type PromiseSettledResult<T, E> = PromiseFulfilledResult<T> | PromiseRejectedResult<E>;

interface PromiseConstructor {
  // all() propagates errors. `GetRejectsWith` extracts a promise's known rejection types.
  // The heuristic above would give identical behavior for Promise.all, but wouldn't work for allSettled.
  all<T extends readonly unknown[] | []>(values: T): 
    Promise<{ -readonly [P in keyof T]: Awaited<T[P]>; }, GetRejectsWith<T[number]>>;

  // allSettled() preserves known errors in the PromiseSettledResults, but removes all known
  // errors on the returned Promise, by leaving off its second type parameter (which 
  // represents the known rejection types and would default to never).
  allSettled<T extends readonly unknown[] | []>(values: T): 
    Promise<{ -readonly [P in keyof T]: PromiseSettledResult<Awaited<T[P]>, GetRejectsWith<T[P]>>; }>;
}

When you say "if the feature isn't going to work well under indirection anyway...", I guess I was hoping that it could work well under indirection, and that was my motivation for a more-complex syntax than what JSDoc supports. My "when types combine..." proposal was an attempt, without really knowing how TS is implemented, to preserve the ability for error typing to work under indirection at least reasonably well (certainly better than it would with JSDoc).

These examples with promise error typing and .map gesture at what I meant by:

some type system involvement probably is necessary to make this all work. As much as the UnknownError/open-ended union portion of this proposal is a weird fit for the type system, the work to figure out the non-UnknownError constituents of the ErrType requires very standard type system machinery, especially for generic error types.

But, let's assume that doing anything with errors in the type system is not worth it; that the JSDoc syntax is sufficiently expressive; and that the parametric error cases can be handled well enough with some heuristics. Then...

Some analysis still needs to happen on TS source code before it's compiled/published, to identify relevant errors, with the results being serialized to JSDoc comments.
If every library author has to manually enable that analysis for their library — say, by installing some TS compiler plugin — many libraries won't, so consumers won't have access to nearly as much error information.
Therefore, this proposal becomes a request for TS to do that analysis so that all consumers benefit. TS would analyze every function's error handling according to the ErrType inference rules in the OP and serialize the results to JSDoc on emit. And then use this info in IDE popups.

[^1]: E.g., if the code says throw new ErrorSubclass(), and ErrorSubclass has some code property that, in the original TS source, is defined as a literal type (to make the error structurally distinct), I'm not sure how that literal-ness would get recovered post-compilation. I imagine this problem gets worse if the code uses a helper function or an external library to create its custom error subclasses. Also, minification is gonna mangle error subclass names into total inscrutability. So all this has me thinking that trying to recover nice error types from raw JS feels like a losing battle.

ethanresnick commented 4 months ago

@RyanCavanaugh Any further thoughts here?
