WordPress / gutenberg

The Block Editor project for WordPress and beyond. Plugin is available from the official repository.
https://wordpress.org/gutenberg/
Other
10.52k stars 4.2k forks source link

docgen: remove doctrine #18045

Closed oandregal closed 3 years ago

oandregal commented 5 years ago

doctrine is the JSDoc parser we use in docgen to get structured information out of the JSDoc comments. It was created and maintained by the ESLint people, although it's no longer actively developed. We should migrate out of doctrine for that reason.

The suggested replacement (eslint-plugin-jsoc) uses a different JSDoc parser with support for many other syntaxes (typescript, etc). We should look into trying jsdoctypeparser or any other that's actively developed.

oandregal commented 5 years ago

Some notes:

The plan would be to update the get-jsdoc-from-token exported function to use whatever doctrine alternative we settle on.

oandregal commented 5 years ago

Relevant to this is the research I did at the time on doctrine alternatives (see comment on Feb 22): https://github.com/WordPress/gutenberg/pull/13329#issuecomment-466389286

aduth commented 4 years ago

I started exploring this, initially aiming to use comment-parser, seeing as it's the same in use by eslint-plugin-jsdoc, and on the assumption that we don't really need the type parsing (vs. just the verbatim name). It worked okay, but unlike Doctrine, there's no built-in understanding of "typed" or "named" tags, so these had to be manually adapted from Doctrine. Even then, however, there's some data loss that occurs when trying to reconstruct unnamed tags (like an @example tag, losing whitespace).

This is about as far as I got with the implementation, but while it technically passes the unit tests, it fails in creating identical documentation for many of the packages for the above reason:

(Click to expand get-jsdoc-from-token.js source) ```js /** * External dependencies. */ const parse = require( 'comment-parser' ); /** * Internal dependencies. */ const getLeadingComments = require( './get-leading-comments' ); /** * Set of tags which can assign a name. * * @see https://github.com/eslint/doctrine/blob/0e8eba7/lib/doctrine.js#L64-L66 * * @type {Set} */ const NAMED_TAGS = new Set( [ 'param', 'argument', 'arg', 'property', 'prop', 'alias', 'this', 'mixes', 'requires', 'const', 'constant', ] ); /** * Set of tags which can assume a type. * * @see https://github.com/eslint/doctrine/blob/0e8eba7/lib/doctrine.js#L85-L90 * * @type {Set} */ const TYPED_TAGS = new Set( [ 'throws', 'const', 'constant', 'property', 'prop', 'namespace', 'member', 'var', 'module', 'constructor', 'class', 'extends', 'augments', 'public', 'private', 'protected', 'return', 'returns', 'define', 'enum', 'implements', 'this', 'type', 'typedef', 'param', 'argument', 'arg', ] ); /** * @typedef {Object} DocgenJSDocTag * * @property {string} title Tag title. * @property {?string} description Tag description. Null if empty. * @property {string=} type Tag type, if relevant for tag name. * @property {string=} name Tag name, if of a named tag title. */ /** * @typedef {Object} DocgenJSDoc * * @property {string} description Comment description. * @property {DocgenJSDocTag[]} tags Comment tags. */ /** * Function that takes an Espree token and returns a object representing the * leading JSDoc comment of the token, if any. * * @param {Object} token Espree token. * * @return {DocgenJSDoc|undefined} Object representing the JSDoc comment. */ module.exports = function( token ) { const comments = getLeadingComments( token ); if ( ! comments ) { return; } const parsed = parse( '/*' + comments + '*/' ); if ( ! parsed.length ) { return; } const { description, tags } = parsed[ 0 ]; return { description, tags: tags.map( ( tag ) => { const isNamedTag = NAMED_TAGS.has( tag.tag ); const isTypedTag = TYPED_TAGS.has( tag.tag ); const mappedTag = { title: tag.tag, description: tag.description, }; if ( isNamedTag ) { mappedTag.name = tag.name; } else { mappedTag.description = [ tag.name.trim(), mappedTag.description, ].filter( Boolean ).join( ' ' ); } if ( ! mappedTag.description ) { mappedTag.description = null; } if ( isTypedTag ) { mappedTag.type = tag.type; } return mappedTag; } ), }; }; ```

Curious to align to TypeScript's JSDoc tooling, given that we want to embrace some of this typing syntax (#18838), I discovered theirs is an entirely home-grown solution (some of the source). It makes me wonder though, seeing as we already have to parse the source to find comments and export statements, whether it might be worth using TypeScript both for the parsing of the source and for the parsing of the JSDoc. I haven't yet followed this idea to see how feasible it would be.

aduth commented 4 years ago

It makes me wonder though, seeing as we already have to parse the source to find comments and export statements, whether it might be worth using TypeScript both for the parsing of the source and for the parsing of the JSDoc.

Relevant documentation: https://github.com/microsoft/TypeScript/wiki/Using-the-Compiler-API

Notable here is to feed a source string into ts.createSourceFile:

var ts = require( 'typescript' );
ts.createSourceFile( '', '/**\n * Function description.\n *\n * @param {string} myVariable My variable description.\n */', ts.ScriptTarget.ES2015 );

ASTExplorer also has a TypeScript mode for exploring the AST: https://astexplorer.net/#/gist/34e05c635ea8a8904a4d557afc9dccc8/04f2263c5748d52b7a523809ace596e7f6e31e5e

It's includes full parsing of the JSDoc, including types.

aduth commented 4 years ago

Noting: While the TypeScript parser does parse the complex types, we will need to make sure we can print those in a meaningful way. Rolling something custom here could become very hairy very quickly, given the full scope of supported types.

Consider: https://astexplorer.net/#/gist/4a7169d71b3f44e840f7f57bfbaee5e6/1d0dbdd8e0d27568388e9a5a980402e6d4f1a5ba

VSCode actually evaluates this very nicely:

image

Ideally we can reuse this for our own documented output.

I've found a little bit of a lead in the TypeScript Compiler API documentation, though the documentation is far from exhaustive.

Specifically, we may be able to use parts of the "Using the Type Checker" section:

https://github.com/microsoft/TypeScript/wiki/Using-the-Compiler-API#using-the-type-checker

let program = ts.createProgram(fileNames, options);
// ...
let checker = program.getTypeChecker();
// ...
checker.typeToString( checker.getTypeOfSymbolAtLocation( /* ... */ ) );
aduth commented 4 years ago

Even without a complete substitution of Doctrine, we probably need to find a way to address some of the current problems with the types printing sooner than later. Specifically, we are seeing many instances that docgen will generate a type of null for TypeScript-specific complex types. To me, this is worse than if it simply didn't document the type at all, since it is both incorrect and misleading.

For example, see also: #19520

aduth commented 4 years ago

Even without a complete substitution of Doctrine, we probably need to find a way to address some of the current problems with the types printing sooner than later. Specifically, we are seeing many instances that docgen will generate a type of null for TypeScript-specific complex types. To me, this is worse than if it simply didn't document the type at all, since it is both incorrect and misleading.

Interim fix proposed at #19571

aduth commented 4 years ago

One other possible advantage of using the TypeScript compiler is more in how we already use a different parser (Espree) than that which runs over our own code (Babel). This hasn't been much of an issue thus far, but I just encountered my first issue where a particularly new syntax that we we support through Babel (ES2019 optional catch binding) will throw an error when running npm run docs:build:

export function isURL( url ) {
    try {
        new URL( url );
        return true;
    } catch {
        return false;
    }
}
 ⇒ npm run docs:build

> gutenberg@7.3.0 docs:build /Users/andrew/Documents/Code/gutenberg
> node ./docs/tool/index.js && node ./bin/api-docs/update-readmes.js

url 
SyntaxError: Unexpected token {

Using TypeScript doesn't really address the underlying issue that we have different parsers being used, but we could perhaps assume the TypeScript would be more up-to-date. Alternatively (and as an interim solution), it might just be a matter of keeping the espree dependency up to date, assuming that it's been updated for newer versions of the specification.

aduth commented 4 years ago

Alternatively (and as an interim solution), it might just be a matter of keeping the espree dependency up to date, assuming that it's been updated for newer versions of the specification.

This is unfortunately not as simple as it might seem, since one of the breaking changes introduced in espree@5.0.0 is to remove the attachComment feature, which is pretty critical to how docgen works (or at least is currently implemented). Thus, upgrading to the latest version would require a pretty significant refactor to the code, one where we'd likely emulate how comment attachment worked previously, using the comment (and perhaps tokens) option(s) to infer where a comment occurs in relation to an exported member. I started down this path on Friday, but it seems significant enough to warrant whether it's worth the effort, considering if the alternative to Doctrine would bring with it a different parser implementation as well (e.g. TypeScript).

One thing I noted in the process of this is that we configure the ECMAScript version here:

https://github.com/WordPress/gutenberg/blob/1fb3c386001266ad03ed6d2cd3d9ad8777090028/packages/docgen/src/engine.js#L15

This is particularly relevant for my previous comment where I mention being unable to use ES2019 features. It's pretty clear by this code why this wouldn't be expected to work 😄 I haven't tested yet, but it's possible that Espree versions prior to the breaking 5.0 version might have support for newer versions of ECMAScript, so that we could still use those language features without bumping to the latest Espree version.

sainthkh commented 4 years ago

I'm working on this with #19952.