syntax-tree / hast-util-raw

utility to reparse a hast tree
https://unifiedjs.com
MIT License

Nesting of list items when there is an html element in one of them #14

Closed mathieudutour closed 3 years ago

mathieudutour commented 3 years ago

Affected packages and versions

6.1.0

Link to runnable example

https://codesandbox.io/s/remark-rehype-debug-forked-4hivb?file=/src/index.js

Steps to reproduce

Given the following markdown string:

- a <table> b
- c

rehype produces the following AST (with the allowDangerousHtml option; position information stripped for clarity):

{
  "type": "root",
  "children": [
    {
      "type": "element",
      "tagName": "ul",
      "properties": {},
      "children": [
        {
          "type": "text",
          "value": "\n"
        },
        {
          "type": "element",
          "tagName": "li",
          "properties": {},
          "children": [
            {
              "type": "text",
              "value": "a "
            },
            {
              "type": "raw",
              "value": "<table>"
            },
            {
              "type": "text",
              "value": " b"
            }
          ]
        },
        {
          "type": "text",
          "value": "\n"
        },
        {
          "type": "element",
          "tagName": "li",
          "properties": {},
          "children": [
            {
              "type": "text",
              "value": "c"
            }
          ]
        },
        {
          "type": "text",
          "value": "\n"
        }
      ]
    }
  ]
} 

Running rehype-raw on it gives the following AST:

{
  "type": "root",
  "children": [
    {
      "type": "element",
      "tagName": "ul",
      "properties": {},
      "children": [
        {
          "type": "text",
          "value": "\n"
        },
        {
          "type": "element",
          "tagName": "li",
          "properties": {},
          "children": [
            {
              "type": "text",
              "value": "a  b\n"
            },
            {
              "type": "element",
              "tagName": "li",
              "properties": {},
              "children": [
                {
                  "type": "text",
                  "value": "c"
                }
              ]
            },
            {
              "type": "text",
              "value": "\n"
            },
            {
              "type": "element",
              "tagName": "table",
              "properties": {},
              "children": []
            }
          ]
        }
      ]
    }
  ]
} 

As you can see, the second list item is now nested inside the first one, and the table has moved to the end of that first item.

Expected behavior

I would expect all list items to stay at the same level, as siblings under the ul.

Actual behavior

The nesting level of the list items changes: the second item becomes a child of the first.
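To make the difference between the two trees checkable, here is a minimal sketch in plain JavaScript that walks a hast-like tree and collects every li that sits inside another li without a ul or ol in between. The helper names (`el`, `text`, `findNestedListItems`) are hypothetical and not part of any unified package, and the two trees are hand-written abbreviations of the ASTs above.

```javascript
// Hypothetical helpers for building small hast-like nodes by hand.
const el = (tagName, children = []) => ({type: 'element', tagName, properties: {}, children})
const text = (value) => ({type: 'text', value})

// Collect every <li> that has an <li> ancestor with no <ul>/<ol> in between.
// A properly nested list (li > ul > li) is not reported.
function findNestedListItems(node, insideLi = false, found = []) {
  if (node.type === 'element') {
    if (node.tagName === 'li') {
      if (insideLi) found.push(node)
      insideLi = true
    } else if (node.tagName === 'ul' || node.tagName === 'ol') {
      insideLi = false
    }
  }
  for (const child of node.children || []) {
    findNestedListItems(child, insideLi, found)
  }
  return found
}

// Tree before rehype-raw (abbreviated from the first AST above).
const beforeRaw = {type: 'root', children: [
  el('ul', [
    text('\n'),
    el('li', [text('a '), {type: 'raw', value: '<table>'}, text(' b')]),
    text('\n'),
    el('li', [text('c')]),
    text('\n')
  ])
]}

// Tree after rehype-raw (abbreviated from the second AST above).
const afterRaw = {type: 'root', children: [
  el('ul', [
    text('\n'),
    el('li', [text('a  b\n'), el('li', [text('c')]), text('\n'), el('table')])
  ])
]}

console.log(findNestedListItems(beforeRaw).length) // 0
console.log(findNestedListItems(afterRaw).length) // 1
```

A check like this could be used in a test to flag the reparenting, though it only detects the symptom and says nothing about whether the result is spec-conformant.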

Runtime

Node v16

Package manager

npm v7

OS

macOS

Build and bundle tools

No response

wooorm commented 3 years ago

I believe this behavior is expected, because HTML parsers (browsers) do the same for such input:

<ul>
<li>a <table> b</li>
<li>c</li>
</ul>

https://astexplorer.net/#/gist/74f889d718e49de2fd8bb6db9aed60cc/d52ee3598c1e6d0cbb2fb54a7c48b59b3e6a1b71

arobase-che commented 3 years ago

Firefox doesn't, Chromium does.

wooorm commented 3 years ago

Hmm, interesting.

This project parses the tree again, as if it were serialized HTML. An example of that is shown in the readme, where an h2 is inside an h1. The HTML spec prescribes that they should then be made adjacent to each other. The algorithm for that is extremely complex and defined here: https://html.spec.whatwg.org/multipage/parsing.html#tree-construction.

We are using parse5 to handle it, which is generally quite good. And if it matches Chrome, then I’d sooner assume that Chrome is also correct and Firefox is wrong than the other way around. So I’m assuming the current actual behavior is the correct expected behavior.

Of course, it could be a bug in both Chrome and parse5, bubbling up here. In that case, the issue should be raised and fixed there, as it’s not something that can be solved in this project.
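The "as if it was serialized HTML" point can be pictured with a toy serializer. This is only a sketch: it is not how hast-util-raw is implemented, the `toHtml` function here is a hypothetical stand-in written for this illustration, and it ignores attributes, escaping, and void elements. It shows that once the raw node is emitted verbatim, the original tree is equivalent to the HTML snippet quoted earlier in this thread, with an unclosed table inside the first li.

```javascript
// Toy serializer (assumption: illustration only, not the library's code).
// Raw nodes are emitted verbatim, which is what lets "<table>" open a real
// element when the result is parsed as HTML.
function toHtml(node) {
  switch (node.type) {
    case 'root':
      return node.children.map((child) => toHtml(child)).join('')
    case 'element':
      return (
        '<' + node.tagName + '>' +
        node.children.map((child) => toHtml(child)).join('') +
        '</' + node.tagName + '>'
      )
    case 'raw':
    case 'text':
      return node.value
    default:
      return ''
  }
}

// The tree from the original report, before rehype-raw runs.
const reportedTree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'ul',
      properties: {},
      children: [
        {type: 'text', value: '\n'},
        {
          type: 'element',
          tagName: 'li',
          properties: {},
          children: [
            {type: 'text', value: 'a '},
            {type: 'raw', value: '<table>'},
            {type: 'text', value: ' b'}
          ]
        },
        {type: 'text', value: '\n'},
        {
          type: 'element',
          tagName: 'li',
          properties: {},
          children: [{type: 'text', value: 'c'}]
        },
        {type: 'text', value: '\n'}
      ]
    }
  ]
}

const serialized = toHtml(reportedTree)
console.log(serialized)
// <ul>
// <li>a <table> b</li>
// <li>c</li>
// </ul>
```

From that point on, what happens to the `</li>` and the second `<li>` is decided by the HTML tree construction rules for content inside an open table, not by anything markdown-specific.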
