unifiedjs / unified

☔️ interface for parsing, inspecting, transforming, and serializing content through syntax trees
https://unifiedjs.com
MIT License
4.49k stars 110 forks source link

Define input and output types for transformers #128

Closed remcohaszing closed 3 years ago

remcohaszing commented 3 years ago

This allows to transform this code:

import * as hast from 'hast';
import * as mdast from 'mdast';
import * as unist from 'unist';

const remarkRehype: Plugin<[RemarkRehypeOptions?]> = () => (ast) => {
  const markdown = ast as mdast.Root;
  const html = transform(markdown);
  return html as unist.Root;
};

into:

import * as hast from 'hast';
import * as mdast from 'mdast';
import * as unist from 'unist';

const remarkRehype: Plugin<[RemarkRehypeOptions?], unknown, mdast.Root, hast.Root> = () => (ast) {
  const html = transform(ast);
  return html;
}

Unified processors can’t be properly typed due to their mutable nature. This means there’s no real point in specifying these types for packages written in JavaScript and contain separate type definitions. However, this is useful for unified plugins that are written in TypeScript themselves.

ChristianMurphy commented 3 years ago

However, this is useful for unified plugins that are written in TypeScript themselves.

Could you expand on what you mean by this? Some (perhaps most) TS adopters, myself included use unified's processor, meaning

Unified processors can’t be properly typed due to their mutable nature

applies.

Could you expand on how this would be used? And the challenge with having a base Node over a more specific one?

remcohaszing commented 3 years ago

I think https://github.com/remcohaszing/remark-mdx-frontmatter/blob/main/src/index.ts is a fairly simple example.

The transformer expects an MDX parent node. (I used a unist parent, because I’m not aware of any MDXAST types yet.)

Because transformers currently always specify the ast is of type Node, At the bottom the ast variable needs to be cast to a Parent node using ast as Parent. I believe type casting should be avoided if possible, as TypeScript should usually be able to infer the types from code.

This plugin will crash if a non-parent node is received. Or should any plugin be written to accept any type of unist node as input? :thinking:

ChristianMurphy commented 3 years ago

I think the solution for this would be to leverage the assertion properties of visit and is

For https://github.com/remcohaszing/remark-mdx-frontmatter/blob/83c2376124d89debe53503a0aa877546a59df29c/src/index.ts#L28-L34

visit(ast, ((node): node is TomlNode | YamlNode => node.type === 'toml' || node.type === 'yaml'), (node) => {
  // Only toml and yaml nodes will be passed in, typescript has already narrowed the type
}

For https://github.com/remcohaszing/remark-mdx-frontmatter/blob/83c2376124d89debe53503a0aa877546a59df29c/src/index.ts#L70

if (is<Parent>(ast, (node): node is Parent => typeof node.children !== 'undefined') {
  ast.children.unshift(...imports)
}

I believe type casting should be avoided if possible

:100: agree

The transformer expects an MDX parent node.

:thinking: I'm not as sure about this. Wouldn't specifying it can only be an MDX parent node preclude custom node types?

I used a unist parent, because I’m not aware of any MDXAST types yet.

Not yet, mdx uses the more generic node types https://github.com/mdx-js/mdx/blob/c2134e15089c9f90140c95b4556c334dbc4f118b/packages/remark-mdx/types/index.d.ts#L8 Though contributions to MDX or XDM adding the MDX and ESAST node types would be welcome!

This plugin will crash if a non-parent node is received. Or should any plugin be written to accept any type of unist node as input?

I'd lean towards this a bit, unist utils make it not-too-bad to more flexibly support different document/tree structures. Asserting the type of tree from a pure type perspective, could be ideal. It isn't supported, due to unified itself being not able carry the type checking through (due to Compiler and Stringifier order complexity). Because build time checking isn't supported, the end result would be a runtime error.


Bringing it full circle to making transformer generic, due to the current limitations of the types in Unified. I'm not sure how much benefit adding the generic here would bring?

It would hide runtime errors, which could be detected at build time.

remcohaszing commented 3 years ago

This plugin will crash if a non-parent node is received. Or should any plugin be written to accept any type of unist node as input?

I'd lean towards this a bit, unist utils make it not-too-bad to more flexibly support different document/tree structures.

Alright, with this reasoning I agree it might be better not to rely on the TypeScript types too much.

ChristianMurphy commented 3 years ago

FWIW, I've been thinking down a similar line, and ran into the same issue. Aside from the above suggestion, another I've been considering would be generating abstract data type brindings for mdast and typescript using something like https://github.com/sledorze/morphic-ts. Which would allow for predicates, matchers, lenses, property checking, etc all from one utility.

Which would offer an alternative/compliment to unist-util-is, with more features. But even then, until we find a way to get the compiler node type to flow through, respecting the order properties unified has, each plugin would still need to do some assertions to determine exactly what nodes are being passed in.