uhyo / the-html-programming-language


Abuse of some elements can make a THPL program non-conforming to the HTML standard #3

Open unsigned-wrong-wrong-int opened 3 years ago

unsigned-wrong-wrong-int commented 3 years ago

There are a couple of problems related to the content model of HTML elements.

ruby elements

According to the HTML Standard, the content model of ruby elements is a specific subset of phrasing content.

The content model of ruby elements consists of one or more of the following sequences:

  1. One or the other of the following:
    • Phrasing content, but with no ruby elements and with no ruby element descendants
    • A single ruby element that itself has no ruby element descendants
  2. One or the other of the following:
    • One or more rt elements
    • An rp element followed by one or more rt elements, each of which is itself followed by an rp element

The syntax of RUBY expressions is as follows:

(expression)
<ruby>
(
   (expression) <!-- (A) -->
   <rt> (expression) <!-- (B) --> </rt>
)*
(
   <rt> (expression) </rt>
)?
</ruby>

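The rule above is met only if expression (A) is either of the following:

  • Phrasing content with no ruby elements and no ruby element descendants
  • A single ruby element that itself has no ruby element descendants

The latter is, however, not a valid THPL expression, so (A) must be phrasing content without any ruby element. In other words, no sub-expression of (A) can be a RUBY expression.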

(Note: On the other hand, (B) may include RUBY expressions because the contents of rt elements are not restricted.)

THPL allows (A) or sub-expressions of (A) to be a RUBY expression, while the HTML standard does not.
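The restriction on (A) can be checked mechanically. The following is a minimal sketch, not part of THPL, using Python's standard `html.parser`. The class and function names are hypothetical, and it assumes every `rt` and `ruby` element has an explicit end tag, as in the syntax above. It applies the stricter rule derived here (no ruby element anywhere in a base position), so it would also flag the "single ruby element" form that the HTML Standard permits.

```python
from html.parser import HTMLParser


class RubyBaseChecker(HTMLParser):
    """Counts ruby elements that appear in the base part of another ruby.

    Hypothetical helper: only <ruby> and <rt> tags are tracked, and
    explicit end tags are assumed for both.
    """

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open "ruby" / "rt" elements, outermost first
        self.violations = 0

    def handle_starttag(self, tag, attrs):
        if tag == "ruby":
            # If the nearest tracked ancestor is a ruby (not an rt), this
            # ruby sits in position (A) of that ancestor.
            if self.stack and self.stack[-1] == "ruby":
                self.violations += 1
            self.stack.append(tag)
        elif tag == "rt":
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in ("ruby", "rt") and tag in self.stack:
            # Pop back to the matching open element.
            while self.stack.pop() != tag:
                pass


def count_ruby_in_base(html):
    checker = RubyBaseChecker()
    checker.feed(html)
    checker.close()
    return checker.violations
```

The inputs are plain HTML fragments rather than verified THPL programs. A ruby element nested inside an `rt`, as in case (B), is not counted, while one nested in a base position, as in case (A), is.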

meter elements

The content model of meter elements is also restricted.

Content Model: Phrasing content, but there must be no meter element descendants.

The syntax of METER expressions:

<meter> (expression) <!-- (C) --> </meter>

Using another METER expression for (C) forms a valid THPL expression, although the HTML standard prohibits nesting meter elements.
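To illustrate the meter rule, here is a minimal sketch that counts meter elements occurring as descendants of another meter element. It uses Python's standard `html.parser`. The names `MeterNestingChecker` and `count_nested_meters` are hypothetical and not part of THPL, and the inputs are plain HTML fragments rather than verified THPL programs.

```python
from html.parser import HTMLParser


class MeterNestingChecker(HTMLParser):
    """Counts <meter> start tags that occur inside an open <meter>."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # number of currently open meter elements
        self.violations = 0

    def handle_starttag(self, tag, attrs):
        if tag == "meter":
            if self.depth > 0:
                # A meter descendant of another meter is non-conforming.
                self.violations += 1
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "meter" and self.depth > 0:
            self.depth -= 1


def count_nested_meters(html):
    checker = MeterNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.violations
```

Sibling meter elements are not counted. Only a meter opened while another is still open is reported, which corresponds to using a METER expression for (C).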

uhyo commented 3 years ago

Thank you for reporting this! Although I intended to keep the restrictions imposed by the HTML standard to a minimum, there are some flaws... 😥 I am thinking about how to fix this, targeting a future 0.2.0 release.