syntax-tree / mdast

Markdown Abstract Syntax Tree format
https://unifiedjs.com
1.1k stars 45 forks source link

Drop the `TableHeader` node #6

Closed wooorm closed 8 years ago

wooorm commented 8 years ago

There’s no need to distinguish between table-headers and table-rows, as all markdown tables contain a single header row.

wooorm commented 8 years ago

@eush77 Thoughts?

wooorm commented 8 years ago

As a slight aside, maybe also rename tableRow to row and tableCell to cell?

eush77 commented 8 years ago

Table header is a row, so it certainly makes sense to merge the two. However, there is some semantic distinction, so I'm not so sure about this. It seems like table.children[0] and table.children.slice(1) is a common thing compilers (or plugins) would do (is it maybe just good enough?).

As a slight aside, maybe also rename tableRow to row and tableCell to cell?

Would you also rename listItem to item? I think it doesn't make sense to do one but not the other.

wooorm commented 8 years ago

You’re right, listItem > item is weird, same with the rows and cells!

The reason for dropping tableHeader, even through it has different semantics from rows, is that compilers indeed use [0] and slice(1): remark-html (same with -vdom, -react). So, including the different node types just leads to possibilities where compilers compile wrongly based on weird input.

If other markdown flavours included ways to add multiple header rows, e.g., if the following syntax would ever be considered OK, the tableHeader type is needed:

| foo | bar |
| baz | qux |
| --- | --- |
| quu | qxu |

But I doubt that would happen.

And a final reason to keep the tableHeader type: each compiler gets node, index, parent as arguments, if one were to correctly use the compiler to compile tableCells by patching a function on Compiler# (so unlike how remark-html currently does it), that compiler function would not be able to distinguish between which tagName should be used, th or td. (there’s a similar problem for align, currently happening). It’s a small problem, but still.

eush77 commented 8 years ago

The reason for dropping tableHeader, even through it has different semantics from rows, is that compilers indeed use [0] and slice(1): remark-html (same with -vdom, -react). So, including the different node types just leads to possibilities where compilers compile wrongly based on weird input.

Could you expand on this? As I see, compilers can either compile the table node as a whole and use children[0] and children.slice(1) or compile table-headers and table-rows separately. In the first case there's no reason to check child types at all (remark-html doesn't do that) and so using different node types for headers and rows doesn't change anything. Am I missing something?

In the second case (compiling table-headers and table-rows) compilers could benefit from removing tableHeader type so that they could have a single compiler for tableRow and switch by index where needed (but they can also assign the same handler for tableHeader and check the type so this is not a compelling reason IMO).

Re multiline headers: multiline headings (not the same thing, but close) could be coming in future versions of Commonmark:

We find interpretation 4 most natural, and interpretation 4 increases the expressive power of CommonMark, by allowing multiline headings.

wooorm commented 8 years ago

Could you expand on this? As I see, compilers can either compile the table node as a whole and use children[0] and children.slice(1) or compile table-headers and table-rows separately. In the first case there's no reason to check child types at all (remark-html doesn't do that) and so using different node types for headers and rows doesn't change anything. Am I missing something?

I meant that its possible for an AST to have two tableHeaders, or none at all, or first a tableRow, then a tableHeader. Thus, plug-ins will decide on how they handle malformed ASTs, probably each with a different solution. If there’s a single node type for table rows, there’s no way to produce malformed trees, and thus no way to incorrectly compile them.

In the second case (compiling table-headers and table-rows) compilers could benefit from removing tableHeader type so that they could have a single compiler for tableRow and switch by index where needed (but they can also assign the same handler for tableHeader and check the type so this is not a compelling reason IMO).

No, it’s not very compelling, indeed. My feeling is that it’s prettier to use index == 0 over a node-type for such a small difference though, a bit like the headings depth. Come to think, what about a property? Or is that too verbose?

Re multiline headers: multiline headings (not the same thing, but close) could be coming in future versions of Commonmark:

Oh wow, didn’t see that one coming. Not sure if they’ll ever implement it though. But it’s good to keep in mind! And, yeah I walking about table header rows :wink:

Generally, are you in favour of changing this? I think we now they up- and downsides now, and I’d say one less node-type is for the better!

eush77 commented 8 years ago

Thus, plug-ins will decide on how they handle malformed ASTs, probably each with a different solution. If there’s a single node type for table rows, there’s no way to produce malformed trees, and thus no way to incorrectly compile them.

Didn't this of that. +1 for removing.

I’d say one less node-type is for the better!

Definitely. +1 for removing.

Come to think, what about a property?

You mean something like node.role == 'header | 'row'`? No, that's too much.

Generally, are you in favour of changing this?

I was and remain neutral on this. Now I'm a little bit convinced for removing this type. It's not needed, and less moving parts is usually better.

wooorm commented 8 years ago

Good, then it’s decided, I’ll remove it. Less moving parts is more better!