Simplify `no-unclosed-tag` rule

AndrewJakubowicz commented 6 months ago

Context

This was inspired (and maybe closes) https://github.com/runem/lit-analyzer/issues/283.

The first commit went in a bad direction (which encouraged self-closing syntax (that doesn't exist in HTML)). Second commit changes the lint to use a simpler check which expects non void elements to be explicitly closed.

What is wrong with the current check?

The current no-unclosed-tag check grabs the source code locations for the tag from parse5, and then tries to guess if the tag was closed. This guess is inaccurate, and boils down to:

The node has no children
The startTag.endOffset location matches the node.endOffset location.

This results in edge cases.

Edge case 1

The first edge case can be seen with <p> which can be closed implicitly if followed by a <div>.

Consider:

<p><div></div><!--becomes:--><p></p><div></div>
<p>Some content<div></div><!--becomes:--><p>Some content</p><div></div>

Currently, the first case without content will not raise the no-unclosed-tag check, but the second case does raise the no-unclosed-tag check. This is because the first case passes both the condition that no children are present, and the implicitly closed tag will have the same offset location as the authored tag. But this is a false negative, because the resulting DOM is probably not what the author intended.

The second case is more correct because the <p> element created contains the Some content text and thus is not empty.

Fix

Make this lint require explicit closing tags by checking whether parse5 has got an authored location for the endTag. Because void elements do not have an endTag we also must handle them.

Test plan

All these edge cases have been captured in the tests.

43081j commented 6 months ago

maybe we can actually just check that there is an end tag?

for example:

  <p>i never close

will have a startTag location, and no endTag location.

similarly:

  <p />

will behave the same since you can't self-close a p tag (parse5 will already be taking this into account). so it too won't have an endTag and will actually have a child text node (the whitespace following it).

if you do the same with <br/>, parse5 will give you a node with no children, and no end tag. so it will already be doing a check against a whitelist of void elements (here)

so maybe we can just check !getNodeSourceLocation(p5Node).endTag?

i'm not sure why we didn't do that all along... unless i'm missing something 🤔

AndrewJakubowicz commented 6 months ago

Over the weekend I came to a similar conclusion @43081j so I totally agree with you.

The check can skip void tags, and otherwise check for the presence of the endTag. Will update today. Thanks for an amazing comment!

AndrewJakubowicz commented 6 months ago

cc @43081j and @rictic

This is ready for review. Thank you James for your exceptional feedback!

43081j commented 6 months ago

the CI failure is because one of the tests actually self-closes a custom element FYI 😬

so it already caught one!

AndrewJakubowicz commented 6 months ago

Excellent! :D Thank you!

Added a specific test case for <p> <div></div></p> (which continues to raise the diagnostic).
Fixed the self-closing custom element in the other test.

runem / lit-analyzer