Emoji Rendering Discrepancy Between Inline and Block Elements

https://markdown-it.github.io/#md3=%7B%22source%22%3A%22%3Cspan%3EInline%3A%20%26%23128578%3B%3C%2Fspan%3E%5Cn%5Cn%3Cdiv%3EBlock%3A%20%26%23128578%3B%3C%2Fdiv%3E%22%2C%22defaults%22%3A%7B%22html%22%3Afalse%2C%22xhtmlOut%22%3Afalse%2C%22breaks%22%3Afalse%2C%22langPrefix%22%3A%22language-%22%2C%22linkify%22%3Atrue%2C%22typographer%22%3Atrue%2C%22_highlight%22%3Atrue%2C%22_strict%22%3Atrue%2C%22_view%22%3A%22debug%22%7D%7D

markdown-it parses the provided example into the following token stream:

```json
[
  {
    "type": "paragraph_open",
    "tag": "p",
    "attrs": null,
    "map": [
      0,
      1
    ],
    "nesting": 1,
    "level": 0,
    "children": null,
    "content": "",
    "markup": "",
    "info": "",
    "meta": null,
    "block": true,
    "hidden": false
  },
  {
    "type": "inline",
    "tag": "",
    "attrs": null,
    "map": [
      0,
      1
    ],
    "nesting": 0,
    "level": 1,
    "children": [
      {
        "type": "html_inline",
        "tag": "",
        "attrs": null,
        "map": null,
        "nesting": 0,
        "level": 0,
        "children": null,
        "content": "<span>",
        "markup": "",
        "info": "",
        "meta": null,
        "block": false,
        "hidden": false
      },
      {
        "type": "text",
        "tag": "",
        "attrs": null,
        "map": null,
        "nesting": 0,
        "level": 0,
        "children": null,
        "content": "Inline: 🙂",
        "markup": "&#128578;",
        "info": "entity",
        "meta": null,
        "block": false,
        "hidden": false
      },
      {
        "type": "html_inline",
        "tag": "",
        "attrs": null,
        "map": null,
        "nesting": 0,
        "level": 0,
        "children": null,
        "content": "</span>",
        "markup": "",
        "info": "",
        "meta": null,
        "block": false,
        "hidden": false
      }
    ],
    "content": "<span>Inline: &#128578;</span>",
    "markup": "",
    "info": "",
    "meta": null,
    "block": true,
    "hidden": false
  },
  {
    "type": "paragraph_close",
    "tag": "p",
    "attrs": null,
    "map": null,
    "nesting": -1,
    "level": 0,
    "children": null,
    "content": "",
    "markup": "",
    "info": "",
    "meta": null,
    "block": true,
    "hidden": false
  },
  {
    "type": "html_block",
    "tag": "",
    "attrs": null,
    "map": [
      2,
      3
    ],
    "nesting": 0,
    "level": 0,
    "children": null,
    "content": "<div>Block: &#128578;</div>",
    "markup": "",
    "info": "",
    "meta": null,
    "block": true,
    "hidden": false
  }
]
```
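
Note that in the token stream above, the text token's content already holds the decoded character 🙂, while the html_block content still carries the raw reference &#128578;. Resolving that reference is roughly what an entity resolver would have to do for raw HTML. Below is a minimal sketch of the numeric part only. The helper names are hypothetical, this is not markdown-it's or Marp Core's code, and named entities such as &amp;amp; are out of scope.

```typescript
// Hypothetical helpers for illustration; only numeric character references are handled.

// Convert a code point to a string. Written by hand so it also compiles for
// ES5 targets; String.fromCodePoint does the same on ES2015+.
function codePointToString(codePoint: number): string {
  if (codePoint <= 0xffff) return String.fromCharCode(codePoint)
  // Encode astral code points as a UTF-16 surrogate pair
  const offset = codePoint - 0x10000
  return String.fromCharCode(0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff))
}

// Replace decimal (&#128578;) and hexadecimal (&#x1F642;) references.
function decodeNumericReferences(html: string): string {
  return html.replace(
    /&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g,
    (match: string, hex: string | undefined, dec: string | undefined): string => {
      const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec as string, 10)
      // Leave references outside the Unicode range untouched
      if (codePoint > 0x10ffff) return match
      return codePointToString(codePoint)
    }
  )
}

const decodedBlock = decodeNumericReferences('<div>Block: &#128578;</div>')
console.log(decodedBlock)
```

Even with the reference decoded, the result is still one raw HTML string, so the difficulties described below still apply.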

Marp Core transforms each emoji found in the content of a markdown-it inline token into a marp_unicode_emoji token, and renders every marp_unicode_emoji token as a Twemoji SVG image.

https://github.com/marp-team/marp-core/blob/5c5eda0fb7ea9a202a3b0345202272bb0d9a457f/src/emoji/emoji.ts#L76-L109

On the other hand, a block-level HTML element and its children are parsed as a single html_block token. Marp Core does not transform emoji within html_block tokens because doing so may break raw HTML elements in some cases.
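
The difference between the two paths can be sketched as follows. This is a simplified illustration, not Marp Core's actual code (see the linked emoji.ts for that). The SketchToken type is a stand-in for markdown-it's Token, and the emoji pattern is an assumption that only matches emoji encoded as a single surrogate pair.

```typescript
// Minimal stand-in for the markdown-it Token shape (only the fields used here).
interface SketchToken {
  type: string
  content: string
  children: SketchToken[] | null
}

// Simplified matcher (assumption): emoji encoded as one UTF-16 surrogate pair.
const emojiPattern = /[\uD83C-\uD83E][\uDC00-\uDFFF]/g

// Split one text token into text and marp_unicode_emoji tokens.
function splitEmoji(token: SketchToken): SketchToken[] {
  const pieces: SketchToken[] = []
  let lastIndex = 0
  let match: RegExpExecArray | null
  emojiPattern.lastIndex = 0
  while ((match = emojiPattern.exec(token.content)) !== null) {
    if (match.index > lastIndex) {
      pieces.push({ type: 'text', content: token.content.slice(lastIndex, match.index), children: null })
    }
    pieces.push({ type: 'marp_unicode_emoji', content: match[0], children: null })
    lastIndex = match.index + match[0].length
  }
  if (lastIndex < token.content.length) {
    pieces.push({ type: 'text', content: token.content.slice(lastIndex), children: null })
  }
  return pieces
}

// Only the text children of inline tokens are rewritten.
// Every other token, including html_block, is returned as-is.
function transformInlineEmoji(tokens: SketchToken[]): SketchToken[] {
  return tokens.map((token) => {
    if (token.type !== 'inline' || token.children === null) return token
    const children: SketchToken[] = []
    token.children.forEach((child) => {
      if (child.type === 'text') {
        splitEmoji(child).forEach((piece) => children.push(piece))
      } else {
        children.push(child)
      }
    })
    return { type: token.type, content: token.content, children: children }
  })
}
```

Applied to the token stream above, the text child "Inline: 🙂" becomes a text token followed by a marp_unicode_emoji token, while the html_block token passes through unchanged.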

To transform emoji in html_block tokens correctly, we would need a robust HTML parser and entity resolver that work in both Node.js and the browser. Unfortunately, we have not implemented them yet because of several concerns:

An html_block token may contain only part of a complete HTML block, so well-known HTML-compliant parsers, such as the browser's DOMParser, htmlparser2, and parse5, cannot be used in our use case.
```
<div class="😄">

# Markdown content 👍

</div>
```
In the case above, the HTML is split into two html_block tokens: <div class="😄"> and </div>. If we tried to parse and transform these fragments with a standard parser, the opening tag would be closed automatically, as HTML-compliant parsers do for unclosed elements, and the lone closing tag would fail to parse as invalid HTML.

If we applied a simple string replacement instead, the raw HTML block could break in some edge cases:
- Raw JS: <script>document.title = "🙂";</script> ➡️ <script>document.title = "<img class="emoji" draggable="false" alt="🙂" src="https://twemoji.maxcdn.com/2/svg/1f642.svg" data-marp-twemoji="">";</script>
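
The Raw JS case can be reproduced with a few lines. This is a sketch under stated assumptions: naiveReplaceEmoji is a hypothetical helper, the pattern only matches emoji encoded as a single surrogate pair, and the generated img tag is shortened compared to the real Twemoji output.

```typescript
// Simplified matcher (assumption): emoji encoded as one UTF-16 surrogate pair.
const naiveEmojiPattern = /[\uD83C-\uD83E][\uDC00-\uDFFF]/g

// Hypothetical helper: replaces every emoji in a raw string with an <img> tag,
// without any knowledge of the HTML structure around it.
function naiveReplaceEmoji(html: string): string {
  return html.replace(naiveEmojiPattern, (emoji: string): string => {
    return '<img class="emoji" alt="' + emoji + '">'
  })
}

const brokenScript = naiveReplaceEmoji('<script>document.title = "🙂";</script>')

// The JavaScript string literal now contains unescaped double quotes from the
// injected attributes, so the script is no longer the code the author wrote.
console.log(brokenScript)
```

A replacement that is safe here would have to know it is inside a script element, which is exactly the kind of context only a real HTML parser provides.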
