Synchronize type annotations of JSDoc comments and TypeScript to give users more control over types

RunDevelopment commented 3 weeks ago

Motivation

JSDoc has @param, @returns, and @type annoations that have been used for years in the JS world to annotate the types of function parameters, return types, field types, and more. Right now, WASM bindgen automatically generates @param and @return annotations with types for functions (in JS #4187) or not with skip_jsdoc.

This currently creates the problems that the user-defined JSDoc type annotations are ignored in TS and that WASM-bindgen-generated annotations have to be explicitly disabled to not interfere with the user-defined ones.

Proposed Solution

If the user provides a JSDoc comment for a function, parse the JSDoc comment and:

If a parameter or the return is not documented, generate an @-property with the type annotation for it.
If a parameter or the return is document, but has no type annotation, add the type annotation.
If a parameter or the return is document and has a type annotation, use the annotated type use the annotated type in both JSDoc and TS.

Example:

/// Documentation for the function.
/// @param b This gets its type inferred.
/// @param {100 | 200 | null} c Note that this parameter is *not* optional and does not accept `undefined`.
/// @returns {"success" | "failure"}
#[wasm_bindgen]
pub fn foo(a: &str, b: u32, c: Option<u32>) -> String { todo!() }

// bindings.js
/**
 * Documentation for the function.
 * @param {string} a
 * @param {number} b This gets its type inferred.
 * @param {100 | 200 | null} c Note that this parameter is *not* optional and does not accept `undefined`.
 * @returns {"success" | "failure"}
 */
export function foo(a, b, c) { ... }

// bindings.d.ts
/**
 * Documentation for the function.
 * @param b This gets its type inferred.
 * @param c Note that this parameter is *not* optional and does not accept `undefined`.
 */
export function foo(a: string, b: number, c: 100 | 200 | null): "success" | "failure";

The same idea of syncing types for can also be used for class fields and constants with @type {...}.

This solution not only solves the problem of making sure that typing is consistent between JSDoc and TS, it also:

Makes skip_jsdoc almost unnecessary, since generated annotations no longer interfere with user-defined ones.
Removes the possible error of users accidentally annotating the wrong type when they just wanted to document a parameter.
Allows users to define a custom type for every field, argument, return, and constant using JSDoc.

So not only is this more convenient, ensures consistency, it also gives users more control. The last point is very important, because this fixes #1197, #1591, #1847, #2869, #3107, and #3953.

I want to highlight here, that custom types are defined using a syntax already familiar to JS devs. Any JS dev that has used JSDoc for type annotations before will know what to do without us having to explain anything. Using JSDoc comments also has the advantage that type expressions are not limited by Rust syntax in any way.

Complexity

I think the main challenge for implementing this will be to parse and interpret JSDoc. There is no guarantee that the JSDoc comments will be valid. JSDoc itself is also quite complex, so interpreting it will be tricky even in parsed form.

We might be able to use swc's jsdoc crate for something battle-tested. The main problem with this crate is that pulls in tons of dependencies, and it's entirely undocumented as well. If push comes to shove, we might have to write a simple parser ourselves. (I already worked on JSDoc syntax highlighter and formatter before, so I can do it if need be.)

Backwards compatibility

Since this only affects comments and TS types, I'm not sure if this would be considered a breaking change. I think in most case, with would even be a fixing change, since the TS and JSDoc types finally match.

In any case, I think that invalid JSDoc comments or other problems (e.g. duplicate @param or @return tags) should not cause compiler errors but emit warnings. We can then decide to upgrade those warnings to errors in future releases.

Additional Context

I'm willing to make a PR for this.

I also want to point out this approach of using JSDoc to define typing information naturally extends to other features as well. E.g. generics with @template, custom type definitions with @typedef and @callback, overloads with @overload, and more. By treating JSDoc comments as more than just comments, we have a path for supporting a lot of advanced typing features in a way that is intuitive to most JS/TS devs.

daxpedda commented 1 week ago

My general stance is that wasm-bindgen is a foundational library, while convenience is nice, we can only go so far to provide it. We should only be concerned with enabling others to use wasm-bindgen to achieve any convenience or result as far as possible.

Adding custom JSDoc for parameters and the like definitely sounds in the ballpark for wasm-bindgen, but not parsing JSDoc to make it convenient. Firstly its very important to me that wasm-bindgen stays as lean as possible to make it as maintainable as possible and secondly as a foundational crate adding any dependencies should be avoided as much as possible. Unless its possible to get a working and minimal JSDoc parser without adding any more dependencies (or only minimal ones), in my opinion this would be out of the question.

To that end I would suggest using parameter attributes instead, return type JSDoc comments can be added via function attributes.

Since this only affects comments and TS types, I'm not sure if this would be considered a breaking change. I think in most case, with would even be a fixing change, since the TS and JSDoc types finally match.

So far I've treated TS and JSDoc changes not as breaking changes. We don't have an official policy on this, so I'm fine continuing as we were.

I also want to point out this approach of using JSDoc to define typing information naturally extends to other features as well. E.g. generics with @template, custom type definitions with @typedef and @callback, overloads with @overload, and more. By treating JSDoc comments as more than just comments, we have a path for supporting a lot of advanced typing features in a way that is intuitive to most JS/TS devs.

We don't have to get into this now, but I just wanted to say that I know very little about those things in a JS context. If its relevant, I would appreciate e.g. an explanation or link to one.

RunDevelopment commented 1 week ago

Before I start, I want to point out that syncing has 2 parts to it:

Taking TS types and adding them to an existing (or empty) JSDoc comment, and
Taking types declared in JSDoc and setting them as the TS type from a specific argument/return/etc.

Both require parsing JSDoc comment, but they don't require each other. So even just implementing one of the two would be an improvement IMO.

Also, this isn't incompatible with defining custom types as #[wasm_bindgen(...)] attributes. Quite the opposite. Attributes would benefit from part (1). More on that later

Adding custom JSDoc for parameters and the like definitely sounds in the ballpark for wasm-bindgen, but not parsing JSDoc to make it convenient.

These two tasks (adding JSDoc and parsing it) are related.

Right now, WBG either adds all the JSDoc it wants or nothing at all. This approach means that users documenting their functions will very often conflict with the JSDoc WBG generates. E.g. it's common practice in the JS/TS world to document individual parameters with @param. This makes skip_jsdoc almost mandatory for anyone properly documenting their public functions/methods following common JS practices.

In other words, WBG adding JSDoc is not as useful as it could be, because it doesn't know what it is adding it to. It lacks understanding of existing JSDoc (i.e. parsing it).

This would even be a problem with custom type annotations via #[wasm_bindgen(...)] attribute. Since users documenting their functions would most likely use skip_jsdoc, they would be forced to duplicate the type expression for parameters/returns to have them present in the .js file.

Unless its possible to get a working and minimal JSDoc parser without adding any more dependencies (or only minimal ones), in my opinion this would be out of the question.

That should be possible.

JSDoc syntax is quite simple in principle and designed so that parser don't need to understand every tag. I estimate that a simpler parser could probably be written in around 500~1000 eLOC with no dependencies and a little less with regex. (This includes high-level API functions to modify and print the AST.) That's not nothing, but it's also not too much IMO.

If you think that part (1) of syncing (adding our TS types to existing JSDoc) would be useful, I can implement a simple JSDoc parser and make a PR.

So far I've treated TS and JSDoc changes not as breaking changes. We don't have an official policy on this, so I'm fine continuing as we were.

Sounds good to me. AFAIK, most TS libraries also follow the policy of not seeing (minor) type changes as breaking.

We don't have to get into this now, but I just wanted to say that I know very little about those things in a JS context. If its relevant, I would appreciate e.g. an explanation or link to one.

They are all tags that map to TypeScript features. Almost every feature that has specific syntax in TS has a JSDoc tag to allows you to use that feature in JS. This is TS's strategy to keep adding and expanding features without leaving pure JS devs behind.

Here's the documentation for all JSDoc tags TS supports.

rustwasm / wasm-bindgen