🪦 Archived: this document is not maintained. This document was made jointly with micromark, which was later also turned into markdown-rs. At present, I don’t have the bandwidth to maintain 2 reference parsers and a spec.
Common markup state machine.
Together, the parsing rules described below define what is referred to as a Common Markup parser.
This document is currently in progress. It is developed jointly with a reference parser: micromark. Contributions are welcome.
Some parts that are still in progress:
- Adapters
- Define the regular constructs
- Adapter for rich text to check whether emphasis, strong, resource, or reference sequences make up syntax or text
- Tokenizing the input stream in reverse (GFM allows `asd@asd.com`, so it seems we need to somehow allow matching the `@` and parsing backwards)
- Add an appendix of extensions
The common markup parser parses a markup language that is commonly known as Markdown.
The first definition of this format gave several examples of how it worked, showing input Markdown and output HTML, and came with a reference implementation (known as Markdown.pl). When new implementations followed, they mostly followed the first definition, but deviated from the first implementation, thus making the format a family of formats.
Some years later, an attempt was made to standardize the differences between implementations, by specifying how several edge cases should be handled, through more input and output examples. This attempt is known as CommonMark, and many implementations now follow it.
This document defines a more formal format, based on CommonMark, by documenting how to parse it, instead of documenting input and output examples. This format is:
The origin story of Markdown is similar to that of HTML, which at one time was also a family of formats. Through incredible efforts of the WHATWG, a Living Standard was created on how to parse the format, by defining a state machine.
The common markup parser receives input, typically coming over the network or from the local file system. This input is represented as characters in the input stream. Depending on the character, certain effects occur: a new token is created, one state is switched to another, or something is labelled. Each line is made up of tokens (such as whitespace, markers, sequences, and content) and labels, both of which are enqueued. At a certain point, it is known what to do with the queue: either discard it or use it, in which case it is adapted.
The parser parses in three stages: flow, content, and text, respectively coming with their own state machines (flow state machine, content state machine, text state machine), and their own adapters.
This section defines the fundamental concepts upon which this document is built.
A variable is declared in the shared state with `let`, cleared with `unset`, or changed with `set`, `increment`, `decrement`, `append`, or `prepend`.
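The variable operations can be pictured as methods on a small key/value store. The following is a minimal sketch only: the class name `SharedSpace`, the method names, and the choice to raise an error on redeclaration are assumptions made for illustration and are not defined by this document.

```python
class SharedSpace:
    """Hypothetical key/value store holding parser variables."""

    def __init__(self):
        self.values = {}

    def let(self, name, value):
        # Declare a variable. Treating redeclaration as an error is an assumption.
        if name in self.values:
            raise KeyError(f"{name} is already declared")
        self.values[name] = value

    def unset(self, name):
        # Clear a variable so that it can be declared again later.
        del self.values[name]

    def set(self, name, value):
        # Change a variable that was declared before.
        if name not in self.values:
            raise KeyError(f"{name} is not declared")
        self.values[name] = value

    def increment(self, name, by=1):
        self.values[name] += by

    def decrement(self, name, by=1):
        self.values[name] -= by

    def append(self, name, value):
        self.values[name] = self.values[name] + value

    def prepend(self, name, value):
        self.values[name] = value + self.values[name]


space = SharedSpace()
space.let("sizeFence", 1)
space.increment("sizeFence")
space.let("tagName", "")
space.append("tagName", "i")
space.append("tagName", "v")
space.prepend("tagName", "d")
space.unset("sizeFence")
```

After these calls, `tagName` holds `"div"` and `sizeFence` is no longer declared.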
A character is a Unicode code point and is represented as a four to six digit hexadecimal number, prefixed with `U+` ([UNICODE]).
An ASCII digit is a character in the inclusive range U+0030 (`0`) to U+0039 (`9`).
An ASCII upper hex digit is a character in the inclusive range U+0041 (`A`) to U+0046 (`F`).
An ASCII lower hex digit is a character in the inclusive range U+0061 (`a`) to U+0066 (`f`).
An ASCII hex digit is an ASCII digit, ASCII upper hex digit, or an ASCII lower hex digit.
An ASCII upper alpha is a character in the inclusive range U+0041 (`A`) to U+005A (`Z`).
An ASCII lower alpha is a character in the inclusive range U+0061 (`a`) to U+007A (`z`).
An ASCII alpha is an ASCII upper alpha or ASCII lower alpha.
An ASCII alphanumeric is an ASCII digit or ASCII alpha.
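The ASCII character groups above translate directly into range checks on code points. The sketch below is one possible way to write them; the function names are hypothetical and chosen to mirror the terms in the text.

```python
def is_ascii_digit(code):
    return 0x30 <= code <= 0x39


def is_ascii_upper_hex_digit(code):
    return 0x41 <= code <= 0x46


def is_ascii_lower_hex_digit(code):
    return 0x61 <= code <= 0x66


def is_ascii_hex_digit(code):
    return (
        is_ascii_digit(code)
        or is_ascii_upper_hex_digit(code)
        or is_ascii_lower_hex_digit(code)
    )


def is_ascii_upper_alpha(code):
    return 0x41 <= code <= 0x5A


def is_ascii_lower_alpha(code):
    return 0x61 <= code <= 0x7A


def is_ascii_alpha(code):
    return is_ascii_upper_alpha(code) or is_ascii_lower_alpha(code)


def is_ascii_alphanumeric(code):
    return is_ascii_digit(code) or is_ascii_alpha(code)
```

Each function takes a code point as an integer, for example `is_ascii_hex_digit(ord("f"))`.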
An ASCII punctuation is a character in the inclusive ranges U+0021 EXCLAMATION MARK (`!`) to U+002F SLASH (`/`), U+003A COLON (`:`) to U+0040 AT SIGN (`@`), U+005B LEFT SQUARE BRACKET (`[`) to U+0060 GRAVE ACCENT (`` ` ``), or U+007B LEFT CURLY BRACE (`{`) to U+007E TILDE (`~`).
An ASCII control is a character in the inclusive range U+0000 NULL (NUL) to U+001F (US), or U+007F (DEL).
A Unicode whitespace is a character in the Unicode `Zs` (Separator, Space) category, or U+0009 CHARACTER TABULATION (HT), U+000A LINE FEED (LF), U+000C (FF), or U+000D CARRIAGE RETURN (CR) ([UNICODE]).
A Unicode punctuation is a character in the Unicode `Pc` (Punctuation, Connector), `Pd` (Punctuation, Dash), `Pe` (Punctuation, Close), `Pf` (Punctuation, Final quote), `Pi` (Punctuation, Initial quote), `Po` (Punctuation, Other), or `Ps` (Punctuation, Open) categories, or an ASCII punctuation ([UNICODE]).
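The two Unicode groups can be checked with general categories. The sketch below leans on Python's standard `unicodedata.category`, whose answers depend on the Unicode version bundled with the interpreter; the function names are hypothetical.

```python
import unicodedata

# Inclusive code point ranges for ASCII punctuation, as defined in the text.
ASCII_PUNCTUATION_RANGES = [(0x21, 0x2F), (0x3A, 0x40), (0x5B, 0x60), (0x7B, 0x7E)]
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_ascii_punctuation(character):
    code = ord(character)
    return any(low <= code <= high for low, high in ASCII_PUNCTUATION_RANGES)


def is_unicode_whitespace(character):
    # Zs, or one of HT, LF, FF, and CR.
    return unicodedata.category(character) == "Zs" or character in "\t\n\f\r"


def is_unicode_punctuation(character):
    return (
        unicodedata.category(character) in PUNCTUATION_CATEGORIES
        or is_ascii_punctuation(character)
    )
```

Note that `$` is in the `Sc` category, so it only counts as Unicode punctuation here because it is also an ASCII punctuation.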
An atext is an ASCII alphanumeric, or a character in the inclusive ranges U+0023 NUMBER SIGN (`#`) to U+0027 APOSTROPHE (`'`), U+002A ASTERISK (`*`), U+002B PLUS SIGN (`+`), U+002D DASH (`-`), U+002F SLASH (`/`), U+003D EQUALS TO (`=`), U+003F QUESTION MARK (`?`), U+005E CARET (`^`) to U+0060 GRAVE ACCENT (`` ` ``), or U+007B LEFT CURLY BRACE (`{`) to U+007E TILDE (`~`) ([RFC5322]).
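As an illustration, the atext definition can be written as a table of inclusive ranges. This sketch follows the ranges exactly as listed in this document rather than consulting the RFC; the names `ATEXT_RANGES` and `is_atext` are hypothetical.

```python
# Inclusive code point ranges, taken from the definition in the text.
ATEXT_RANGES = [
    (0x30, 0x39),  # ASCII digit
    (0x41, 0x5A),  # ASCII upper alpha
    (0x61, 0x7A),  # ASCII lower alpha
    (0x23, 0x27),  # NUMBER SIGN to APOSTROPHE
    (0x2A, 0x2B),  # ASTERISK and PLUS SIGN
    (0x2D, 0x2D),  # DASH
    (0x2F, 0x2F),  # SLASH
    (0x3D, 0x3D),  # EQUALS TO
    (0x3F, 0x3F),  # QUESTION MARK
    (0x5E, 0x60),  # CARET to GRAVE ACCENT
    (0x7B, 0x7E),  # LEFT CURLY BRACE to TILDE
]


def is_atext(character):
    code = ord(character)
    return any(low <= code <= high for low, high in ATEXT_RANGES)
```

For example, every character of `asd` is an atext, while `@` and `.` are not, which is what lets a tokenizer find the boundaries of the local part of an email address.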
To ASCII-lowercase a character is to increase it by `0x20`, if it is an ASCII upper alpha.
To digitize a character is to decrease it by `0x30`, `0x37`, or `0x57`, if it is an ASCII digit, ASCII upper hex digit, or ASCII lower hex digit, respectively.
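Both operations are plain arithmetic on code points. A minimal sketch follows, with hypothetical function names; raising an error for characters that are not hex digits is an assumption, as the text does not say what happens in that case.

```python
def ascii_lowercase(code):
    # Only ASCII upper alpha (U+0041 to U+005A) is changed.
    if 0x41 <= code <= 0x5A:
        return code + 0x20
    return code


def digitize(code):
    if 0x30 <= code <= 0x39:  # ASCII digit
        return code - 0x30
    if 0x41 <= code <= 0x46:  # ASCII upper hex digit
        return code - 0x37
    if 0x61 <= code <= 0x66:  # ASCII lower hex digit
        return code - 0x57
    raise ValueError("not an ASCII hex digit")


# Combine digits into a number, as one might for a hexadecimal reference.
value = 0
for character in "1F4a":
    value = value * 16 + digitize(ord(character))
```

Here `value` ends up equal to `0x1F4A`.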
A VIRTUAL SPACE character is a conceptual character representing an expanded column size of a U+0009 CHARACTER TABULATION (HT).
An EOL character is a conceptual character representing a break between two lines.
An EOF character is a conceptual character representing the end of the input.
VIRTUAL SPACE, EOL, and EOF are not real characters, but rather represent an increase in the size of a character, a break between characters, or the lack of any further characters.
Tabs (U+0009 CHARACTER TABULATION (HT)) are typically not expanded into spaces, but do behave as if they were replaced by spaces with a tab stop of 4 characters. These character increments are represented by VIRTUAL SPACE characters.
For the following markup (where `␉` represents a tab):
>␉␉a
We have the characters: U+003E GREATER THAN (`>`), U+0009 CHARACTER TABULATION (HT), VIRTUAL SPACE, VIRTUAL SPACE, U+0009 CHARACTER TABULATION (HT), VIRTUAL SPACE, VIRTUAL SPACE, VIRTUAL SPACE, and U+0061 (`a`).
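The example can be reproduced with a few lines of code. The sketch below is illustrative only: it assumes 1-based columns and a tab stop of 4, and uses the hypothetical marker string `VIRTUAL_SPACE` to stand in for the conceptual character.

```python
VIRTUAL_SPACE = "VIRTUAL SPACE"  # hypothetical stand-in for the conceptual character


def expand_tabs(line, tab_size=4):
    """Keep tabs, and follow each with the virtual spaces up to the next tab stop."""
    characters = []
    column = 1
    for character in line:
        characters.append(character)
        column += 1
        if character == "\t":
            # Pad until the next character would start right after a tab stop.
            while (column - 1) % tab_size != 0:
                characters.append(VIRTUAL_SPACE)
                column += 1
    return characters


example = expand_tabs(">\t\ta")
```

The first tab starts at column 2 and so needs two virtual spaces to reach column 5; the second starts at column 5 and needs three to reach column 9.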
When transforming to an output format, tab characters that are not part of syntax should be present in the output format. When the tab itself (and zero or more VIRTUAL SPACE characters) are part of syntax, but some VIRTUAL SPACE characters are not, the remaining VIRTUAL SPACE characters should be considered a prefix of the content.
The input stream consists of the characters pushed into it.
The input character is the first character in the input stream that has not yet been consumed. Initially, the input character is the first character in the input. When the last character in a line is consumed, the input character is an EOL. Finally, when all characters are consumed, the input character is an EOF.
Any occurrence of U+0009 CHARACTER TABULATION (HT) in the input stream is represented by that character and 0-3 VIRTUAL SPACE characters.
The input stream consists of the characters pushed into it as the input is decoded.
The input, when decoded, is preprocessed and pushed into the input stream as described in the following algorithm:
Let `tabSize` be `4`
Let `line` be `1`
Let `column` be `1`
Let `offset` be `0`
Check:
If `offset` is equal to the length of the document, push an EOF into the input stream representing the lack of any further characters, and return
Otherwise, if the current character is:
↪ U+0000 NULL (NUL)
Increment `offset` by `1`, increment `column` by `1`, push a U+FFFD REPLACEMENT CHARACTER (`�`) into the input stream, and go to the step labelled check
↪ U+0009 CHARACTER TABULATION (HT)
Set `count` to the result of calculating `(tabSize - 1) - ((column - 1) % tabSize)`. Increment `offset` by `1`, increment `column` by `1`, and push the character into the input stream. Perform the following steps `count` times: increment `column` by `1` and push a VIRTUAL SPACE into the input stream representing the size increase. Finally, go to the step labelled check
↪ U+000A LINE FEED (LF)
Increment `offset` by `1`, increment `line` by `1`, set `column` to `1`, push an EOL into the input stream representing the character, and go to the step labelled check
↪ U+000D CARRIAGE RETURN (CR)
Increment `offset` by `1`, increment `line` by `1`, set `column` to `1`, and go to the step labelled carriage return check
↪ Anything else
Increment `offset` by `1`, increment `column` by `1`, push the character into the input stream, and go to the step labelled check
Carriage return check: if the current character is:
↪ U+000A LINE FEED (LF)
Increment `offset` by `1`, push an EOL into the input stream representing the previous and current characters, and go to the step labelled check
↪ Anything else
Push an EOL into the input stream representing the previous character and perform the step labelled check on the current character
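The preprocessing steps can be sketched as a single loop. This is a rough illustration, not the reference implementation: `EOL`, `EOF`, and `VIRTUAL_SPACE` are hypothetical marker strings, and the tab count subtracts 1 from the 1-based `column` before taking the remainder, an assumption made so that the result agrees with the `>␉␉a` example earlier in this document.

```python
EOL = "EOL"  # hypothetical markers for the conceptual characters
EOF = "EOF"
VIRTUAL_SPACE = "VIRTUAL SPACE"


def preprocess(document, tab_size=4):
    stream = []
    line = 1  # tracked as in the algorithm, although this sketch does not use it
    column = 1
    offset = 0
    while offset < len(document):
        character = document[offset]
        offset += 1
        if character == "\0":
            column += 1
            stream.append("\ufffd")
        elif character == "\t":
            # Assumption: column is 1-based, hence the `column - 1`.
            count = (tab_size - 1) - ((column - 1) % tab_size)
            column += 1
            stream.append(character)
            for _ in range(count):
                column += 1
                stream.append(VIRTUAL_SPACE)
        elif character == "\n":
            line += 1
            column = 1
            stream.append(EOL)
        elif character == "\r":
            line += 1
            column = 1
            # Carriage return check: CR followed by LF is one line ending.
            if offset < len(document) and document[offset] == "\n":
                offset += 1
            stream.append(EOL)
        else:
            column += 1
            stream.append(character)
    stream.append(EOF)
    return stream
```

For example, `preprocess("a\r\nb\rc")` yields `a`, an EOL, `b`, an EOL, `c`, and an EOF: the CR LF pair collapses into one EOL, and the lone CR produces its own.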
The states of state machines have certain effects, such as creating items in the queue (tokens and labels). The queue is used by tree adapters in case a valid construct is found. After using the queue, or when a bogus construct is found, the queue is discarded.
The shared space is accessed and mutated by both the tree adapter and the states of the state machine.
Constructs are registered by hooking a case (one or more characters or character groups) into certain states. Upon registration, they define the states used to parse a construct, and the adapter used to handle the construct.
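One way to picture registration is a table keyed by state and case, where each entry lists the start states of the constructs hooked in there. The sketch below is an assumption about how such a table could look: the class `HookRegistry`, its methods, and the state names are hypothetical, and the adapter that a construct also defines is left out for brevity.

```python
class HookRegistry:
    """Hypothetical table of constructs hooked into states."""

    def __init__(self):
        self.hooks = {}

    def register(self, state, case, start_state):
        # Several constructs may hook the same case; keep registration order.
        self.hooks.setdefault((state, case), []).append(start_state)

    def candidates(self, state, case):
        # Start states to try, in order, for this case in this state.
        return list(self.hooks.get((state, case), []))


registry = HookRegistry()
registry.register("flow initial", "#", "atx heading start")
registry.register("flow initial", "-", "setext heading underline dash start")
registry.register("flow initial", "-", "thematic break dash start")
```

A state that finds no candidates for the input character would fall through to its "anything else" case.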
Implementations must act as if they use several state machines to tokenize common markup. The flow state machine is used to tokenize the line constructs that make up the structure of the document. The content state machine is used to tokenize the inline constructs part of content blocks. The text state machine is used to tokenize the inline constructs part of rich or plain text.
Most states consume the input character, and either remain in the state to consume the next character, reconsume the input character in a different state, or switch to a different state to consume the next character. States enqueue tokens and labels.
The shared space is a map of key/value pairs.
The queue is a list of tokens and labels that are enqueued. The current token is the last token in the queue.
Markup is parsed per construct. Some constructs are considered regular (those from CommonMark, such as ATX headings) and other constructs are extensions (such as YAML frontmatter or MDX).
❗️ Define constructs.
To switch to a state is to wait for the next character in the given state.
To consume the input character affects the current token. Due to the nature of the state machine, it is not possible to consume if there is no current token.
To reconsume is to switch to the given state, and consume the input character there.
To enqueue a label is to mark a point between two tokens with a semantic name, at which point there is no current token.
To enqueue a token is to add a new token of the given type to the queue, making it the new current token.
To ensure a token is to enqueue that token if the current token is not of the given type, and otherwise do nothing.
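The queue operations defined above fit in a small class. The sketch below is one reading of those definitions, not the reference implementation: the class and method names, the dictionary shapes, and the token type names are all hypothetical.

```python
class Queue:
    """Hypothetical queue of tokens and labels."""

    def __init__(self):
        self.items = []
        self.current = None  # the current token, if any

    def enqueue_token(self, kind):
        token = {"token": kind, "characters": []}
        self.items.append(token)
        self.current = token

    def enqueue_label(self, name):
        # A label marks a point between tokens; afterwards there is no current token.
        self.items.append({"label": name})
        self.current = None

    def ensure_token(self, kind):
        if self.current is None or self.current["token"] != kind:
            self.enqueue_token(kind)

    def consume(self, character):
        if self.current is None:
            raise RuntimeError("cannot consume without a current token")
        self.current["characters"].append(character)


queue = Queue()
queue.enqueue_label("atxHeadingStart")
for character in "  ":
    queue.ensure_token("whitespace")
    queue.consume(character)
queue.ensure_token("sequence")
queue.consume("#")
```

The two spaces end up in a single whitespace token, because the second `ensure_token` call finds a current token of the right type and does nothing.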
The flow state machine is used to tokenize the line constructs that make up the structure of the document (such as headings or thematic breaks) and must start in the Flow prefix start state.
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Flow start state
Hookable, there are no regular hooks
↪ Anything else
Reconsume in the Flow initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Flow initial state
❗️ Todo: Indented code v.s. content
Hookable, the regular hooks are:
- [x] EOL: Blank line state
- [x] U+0023 NUMBER SIGN (`#`): ATX heading start state
- [x] U+002A ASTERISK (`*`): Thematic break asterisk start state
- [ ] U+002A ASTERISK (`*`):
- [ ] U+002B PLUS SIGN (`+`):
- [x] U+002D DASH (`-`): Setext heading underline dash start state
- [x] U+002D DASH (`-`): Thematic break dash start state
- [ ] U+002D DASH (`-`):
- [x] U+003C LESS THAN (`<`): Flow HTML start state
- [x] U+003D EQUALS TO (`=`): Setext heading underline equals to start state
- [ ] U+003E GREATER THAN (`>`):
- [x] U+005F UNDERSCORE (`_`): Thematic break underscore start state
- [x] U+0060 GRAVE ACCENT (`` ` ``): Fenced code grave accent start state
- [x] U+007E TILDE (`~`): Fenced code tilde start state
- [ ] ASCII digit:
❗️ Todo, continuation:
↪ EOF
Enqueue an End-of-file token
↪ Anything else
Reconsume in the Flow content state
↪ EOL
Enqueue a Blank line end label, enqueue an End-of-line token, and consume
↪ Anything else
Enqueue a NOK label
↪ U+0023 NUMBER SIGN (`#`)
Let `sizeFence` be `1`, enqueue an ATX heading start label, enqueue an ATX heading fence start label, enqueue a Sequence token, consume, and switch to the ATX heading fence open inside state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Unset `sizeFence`, enqueue an ATX heading fence end label, and reconsume in the ATX heading inside state
↪ U+0023 NUMBER SIGN (`#`)
If `sizeFence` is not `6`, increment `sizeFence` by `1` and consume
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Unset `sizeFence` and enqueue a NOK label
↪ EOF
Enqueue an ATX heading end label and enqueue an End-of-file token
↪ EOL
Enqueue an ATX heading end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+0023 NUMBER SIGN (`#`)
Ensure a Sequence token and consume
↪ Anything else
Ensure a Content token and consume
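The ATX heading states above can be followed on a single line of input. The sketch below is a simplified illustration under several assumptions: the line has no leading whitespace and no line ending, VIRTUAL SPACE is not modelled, labels are plain strings, tokens are `[type, text]` lists, and all of the names are hypothetical. It returns `None` where the states would enqueue a NOK label.

```python
WHITESPACE = " \t"


def tokenize_atx_heading(line):
    queue = []

    def ensure(kind, character):
        # Extend the current token when it has the given type, else start a new one.
        if queue and isinstance(queue[-1], list) and queue[-1][0] == kind:
            queue[-1][1] += character
        else:
            queue.append([kind, character])

    # Start state: only a number sign opens the construct.
    if not line.startswith("#"):
        return None
    queue += ["atxHeadingStart", "atxHeadingFenceStart", ["sequence", "#"]]
    size_fence = 1
    index = 1

    # Fence open inside state: at most six number signs.
    while index < len(line) and line[index] == "#":
        if size_fence == 6:
            return None
        size_fence += 1
        queue[-1][1] += "#"
        index += 1
    # The fence must be followed by whitespace or the end of the line.
    if index < len(line) and line[index] not in WHITESPACE:
        return None
    queue.append("atxHeadingFenceEnd")

    # Inside state: split the rest into whitespace, sequences, and content.
    for character in line[index:]:
        if character in WHITESPACE:
            ensure("whitespace", character)
        elif character == "#":
            ensure("sequence", character)
        else:
            ensure("content", character)
    queue.append("atxHeadingEnd")
    return queue
```

For `## a #`, this gives the start labels, a `##` sequence, the fence end label, then whitespace, content `a`, whitespace, a `#` sequence, and the end label. Deciding whether that trailing sequence is a closing fence or content is left to an adapter.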
↪ U+002A ASTERISK (`*`)
Let `sizeTotalSequence` be `1`, enqueue a Thematic break start label, enqueue a Sequence token, consume, and switch to the Thematic break asterisk inside state
↪ Anything else
Enqueue a NOK label
↪ EOF
If `sizeTotalSequence` is greater than or equal to `3`, unset `sizeTotalSequence`, enqueue a Thematic break end label, and enqueue an End-of-file token
Otherwise, treat it as per the “anything else” entry below
↪ EOL
If `sizeTotalSequence` is greater than or equal to `3`, unset `sizeTotalSequence`, enqueue a Thematic break end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
Otherwise, treat it as per the “anything else” entry below
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+002A ASTERISK (`*`)
Increment `sizeTotalSequence` by `1`, ensure a Sequence token, and consume
↪ Anything else
Unset `sizeTotalSequence` and enqueue a NOK label
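Stripped of tokens and labels, the thematic break states reduce to counting markers. A minimal sketch follows, assuming the line has no leading whitespace and no line ending; the function name and the `marker` parameter are hypothetical.

```python
def is_thematic_break(line, marker="*"):
    # Start state: the construct only opens on the marker itself.
    if not line.startswith(marker):
        return False
    size_total_sequence = 0
    for character in line:
        if character == marker:
            size_total_sequence += 1
        elif character in " \t":
            continue  # whitespace is allowed between markers
        else:
            return False  # anything else: NOK
    # At the end of the line, at least three markers are needed.
    return size_total_sequence >= 3
```

Whitespace between markers does not reset the count, so `* * *` is accepted just like `***`, while `**` and `* a *` are not.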
❗️ Todo: exit if not preceded by content
↪ U+002D DASH (`-`)
Enqueue a Setext heading underline start label, enqueue a Sequence token, consume, and switch to the Setext heading underline dash inside state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Setext heading underline dash after state
↪ U+002D DASH (`-`)
Consume
↪ Anything else
Enqueue a NOK label
❗️ Todo: Close content if ok, create a new content if nok
↪ EOF
Enqueue a Setext heading underline end label and enqueue an End-of-file token
↪ EOL
Enqueue a Setext heading underline end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Enqueue a NOK label
↪ U+002D DASH (`-`)
Let `sizeTotalSequence` be `1`, enqueue a Thematic break start label, enqueue a Sequence token, consume, and switch to the Thematic break dash inside state
↪ Anything else
Enqueue a NOK label
↪ EOF
If `sizeTotalSequence` is greater than or equal to `3`, unset `sizeTotalSequence`, enqueue a Thematic break end label, and enqueue an End-of-file token
Otherwise, treat it as per the “anything else” entry below
↪ EOL
If `sizeTotalSequence` is greater than or equal to `3`, unset `sizeTotalSequence`, enqueue a Thematic break end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
Otherwise, treat it as per the “anything else” entry below
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+002D DASH (`-`)
Increment `sizeTotalSequence` by `1`, ensure a Sequence token, and consume
↪ Anything else
Unset `sizeTotalSequence` and enqueue a NOK label
↪ U+003C LESS THAN (`<`)
Let `kind` be `0`, let `endTag` be `null`, let `tagName` be the empty string, enqueue a Content token, consume, and switch to the Flow HTML tag open state
↪ Anything else
Enqueue a NOK label
↪ U+0021 EXCLAMATION MARK (`!`)
Consume and switch to the Flow HTML markup declaration open state
↪ U+002F SLASH (`/`)
Set `endTag` to `true`, consume, and switch to the Flow HTML end tag open state
↪ U+003F QUESTION MARK (`?`)
Set `kind` to `3`, unset `endTag`, consume, and switch to the Flow HTML continuation declaration before state
↪ ASCII alpha
Append the ASCII-lowercased character to `tagName`, consume, and switch to the Flow HTML tag name state
↪ Anything else
Unset `kind`, `endTag`, and `tagName`, and enqueue a NOK label
↪ `--` (two U+002D DASH (`-`) characters)
Set `kind` to `2`, unset `endTag`, consume, and switch to the Flow HTML continuation declaration before state
↪ `[CDATA[` (the five upper letters “CDATA” with a U+005B LEFT SQUARE BRACKET (`[`) before and after)
Set `kind` to `5`, unset `endTag`, consume, and switch to the Flow HTML continuation state
Set `kind` to `4`, unset `endTag`, consume, and switch to the Flow HTML continuation state
↪ Anything else
Unset `kind`, `endTag`, and `tagName`, and enqueue a NOK label
↪ ASCII alpha
Append the ASCII-lowercased character to `tagName`, consume, and switch to the Flow HTML tag name state
↪ Anything else
Unset `kind`, `endTag`, and `tagName`, and enqueue a NOK label
If `tagName` is a raw tag and `endTag` is not `true`, set `kind` to `1`, unset `tagName`, unset `endTag`, and reconsume in the Flow HTML continuation state
Otherwise, if `tagName` is a basic tag, set `kind` to `6`, unset `tagName`, unset `endTag`, and reconsume in the Flow HTML continuation state
Otherwise, treat it as per the “anything else” entry below
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
If `tagName` is a raw tag and `endTag` is not `true`, set `kind` to `1`, unset `tagName`, unset `endTag`, and switch to the Flow HTML continuation state
❗️ Todo: ignore this check if interrupting content.
Otherwise, if `tagName` is not a raw tag, unset `tagName`, consume, and switch to the Flow HTML complete attribute name before state
Otherwise, treat it as per the “anything else” entry below
↪ U+002D DASH (`-`)
Append the character to `tagName` and consume
↪ U+002F SLASH (`/`)
If `tagName` is a basic tag, unset `tagName`, consume, and switch to the Flow HTML basic self closing state
❗️ Todo: ignore this check if interrupting content.
Otherwise, if `tagName` is not a raw tag and `endTag` is not `true`, unset `tagName`, consume, and switch to the Flow HTML complete self closing state
Otherwise, treat it as per the “anything else” entry below
↪ U+003E GREATER THAN (`>`)
If `tagName` is a raw tag and `endTag` is not `true`, set `kind` to `1`, unset `tagName`, unset `endTag`, consume, and switch to the Flow HTML continuation state
Otherwise, if `tagName` is a basic tag, set `kind` to `6`, unset `tagName`, unset `endTag`, and reconsume in the Flow HTML continuation state
❗️ Todo: ignore this check if interrupting content.
Otherwise, if `tagName` is not a raw tag, unset `tagName`, consume, and switch to the Flow HTML complete tag after state
Otherwise, treat it as per the “anything else” entry below
Append the ASCII-lowercased character to `tagName` and consume
↪ Anything else
Unset `kind`, `endTag`, and `tagName`, and enqueue a NOK label
↪ U+003E GREATER THAN (`>`)
Set `kind` to `6`, unset `endTag`, consume, and switch to the Flow HTML continuation state
↪ Anything else
Unset `kind` and `endTag` and enqueue a NOK label
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+002F SLASH (`/`)
If `endTag` is not `true`, consume and switch to the Flow HTML complete self closing state
Otherwise, treat it as per the “anything else” entry below
↪ U+003A COLON (`:`)\
↪ U+005F UNDERSCORE (`_`)\
↪ ASCII alpha
If `endTag` is not `true`, consume and switch to the Flow HTML complete attribute name state
Otherwise, treat it as per the “anything else” entry below
↪ U+003E GREATER THAN (`>`)
Consume and switch to the Flow HTML complete tag after state
↪ Anything else
Unset `kind` and `endTag` and enqueue a NOK label
↪ U+002D DASH (`-`)\
↪ U+002E DOT (`.`)\
↪ U+003A COLON (`:`)\
↪ U+005F UNDERSCORE (`_`)\
↪ ASCII alphanumeric
Consume
↪ Anything else
Reconsume in the Flow HTML complete attribute name after state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+002F SLASH (`/`)
If `endTag` is not `true`, consume and switch to the Flow HTML complete self closing state
Otherwise, treat it as per the “anything else” entry below
↪ U+003D EQUALS TO (`=`)
Consume and switch to the Flow HTML complete attribute value before state
↪ U+003E GREATER THAN (`>`)
Consume and switch to the Flow HTML complete tag after state
↪ Anything else
Unset `kind` and `endTag` and enqueue a NOK label
↪ EOF\
↪ EOL\
↪ U+003C LESS THAN (`<`)\
↪ U+003D EQUALS TO (`=`)\
↪ U+003E GREATER THAN (`>`)\
↪ U+0060 GRAVE ACCENT (`` ` ``)
Unset `kind` and `endTag` and enqueue a NOK label
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+0022 QUOTATION MARK (`"`)
Consume and switch to the Flow HTML complete attribute value double quoted state
↪ U+0027 APOSTROPHE (`'`)
Consume and switch to the Flow HTML complete attribute value single quoted state
↪ Anything else
Consume and switch to the Flow HTML complete attribute value unquoted state
Unset `kind` and `endTag` and enqueue a NOK label
↪ U+0022 QUOTATION MARK (`"`)
Consume and switch to the Flow HTML complete attribute name before state
↪ Anything else
Consume
Unset `kind` and `endTag` and enqueue a NOK label
↪ U+0027 APOSTROPHE (`'`)
Consume and switch to the Flow HTML complete attribute name before state
↪ Anything else
Consume
↪ EOF\
↪ EOL\
↪ VIRTUAL SPACE\
↪ U+0009 CHARACTER TABULATION (HT)\
↪ U+0020 SPACE (SP)\
↪ U+0022 QUOTATION MARK (`"`)\
↪ U+0027 APOSTROPHE (`'`)\
↪ U+003C LESS THAN (`<`)\
↪ U+003D EQUALS TO (`=`)\
↪ U+003E GREATER THAN (`>`)\
↪ U+0060 GRAVE ACCENT (`` ` ``)
Reconsume in the Flow HTML complete attribute name after state
↪ Anything else
Consume
↪ U+003E GREATER THAN (`>`)
Consume and switch to the Flow HTML complete tag after state
↪ Anything else
Unset `kind` and `endTag` and enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Set `kind` to `7`, unset `endTag`, and reconsume in the Flow HTML continuation state
↪ Anything else
Unset `kind` and `endTag` and enqueue a NOK label
↪ EOF
Unset `kind`, enqueue an HTML end label, and enqueue an End-of-file token
↪ EOL
Enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ U+002D DASH (`-`)
If `kind` is `2`, consume and switch to the Flow HTML continuation comment inside state
Otherwise, treat it as per the “anything else” entry below
↪ U+003C LESS THAN (`<`)
If `kind` is `1`, consume and switch to the Flow HTML continuation raw tag open state
Otherwise, treat it as per the “anything else” entry below
↪ U+003E GREATER THAN (`>`)
If `kind` is `4`, consume and switch to the Flow HTML continuation close state
Otherwise, treat it as per the “anything else” entry below
↪ U+003F QUESTION MARK (`?`)
If `kind` is `3`, consume and switch to the Flow HTML continuation declaration before state
Otherwise, treat it as per the “anything else” entry below
↪ U+005D RIGHT SQUARE BRACKET (`]`)
If `kind` is `5`, consume and switch to the Flow HTML continuation character data inside state
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Consume
↪ U+002D DASH (`-`)
Consume and switch to the Flow HTML continuation declaration before state
↪ Anything else
Reconsume in the Flow HTML continuation state
↪ U+002F SLASH (`/`)
Let `tagName` be the empty string, consume, and switch to the Flow HTML continuation raw end tag state
↪ Anything else
Reconsume in the Flow HTML continuation state
Note: This state can be optimized by either imposing a maximum size (the size of the longest possible raw tag name) or by using a trie of the possible raw tag names.
↪ ASCII alpha
Append the ASCII-lowercased character to `tagName` and consume
↪ U+003E GREATER THAN (`>`)
If `tagName` is a raw tag, unset `tagName`, consume, and switch to the Flow HTML continuation close state
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Unset `tagName` and reconsume in the Flow HTML continuation state
↪ U+005D RIGHT SQUARE BRACKET (`]`)
Consume and switch to the Flow HTML continuation declaration before state
↪ Anything else
Reconsume in the Flow HTML continuation state
↪ U+003E GREATER THAN (`>`)
Consume and switch to the Flow HTML continuation close state
↪ Anything else
Reconsume in the Flow HTML continuation state
↪ EOF
Unset `kind`, enqueue an HTML end label, and enqueue an End-of-file token
↪ EOL
Unset `kind`, enqueue an HTML end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ Anything else
Consume
❗️ Todo: exit if not preceded by content
↪ U+003D EQUALS TO (`=`)
Enqueue a Setext heading underline start label, enqueue a Sequence token, consume, and switch to the Setext heading underline equals to inside state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Setext heading underline equals to after state
↪ U+003D EQUALS TO (`=`)
Consume
↪ Anything else
Enqueue a NOK label
❗️ Todo: Close content if ok, create a new content if nok
↪ EOF
Enqueue a Setext heading underline end label and enqueue an End-of-file token
↪ EOL
Enqueue a Setext heading underline end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Enqueue a NOK label
↪ U+005F UNDERSCORE (`_`)
Let `sizeTotalSequence` be `1`, enqueue a Thematic break start label, enqueue a Sequence token, consume, and switch to the Thematic break underscore inside state
↪ Anything else
Enqueue a NOK label
↪ EOF
If `sizeTotalSequence` is greater than or equal to `3`, unset `sizeTotalSequence`, enqueue a Thematic break end label, and enqueue an End-of-file token
Otherwise, treat it as per the “anything else” entry below
↪ EOL
If `sizeTotalSequence` is greater than or equal to `3`, unset `sizeTotalSequence`, enqueue a Thematic break end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
Otherwise, treat it as per the “anything else” entry below
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+005F UNDERSCORE (`_`)
Increment `sizeTotalSequence` by `1`, ensure a Sequence token, and consume
↪ Anything else
Unset `sizeTotalSequence` and enqueue a NOK label
↪ U+0060 GRAVE ACCENT (`` ` ``)
Let `sizeOpen` be `1`, enqueue a Fenced code start label, enqueue a Fenced code fence start label, enqueue a Fenced code fence sequence start label, enqueue a Sequence token, consume, and switch to the Fenced code grave accent open fence inside state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
If `sizeOpen` is greater than or equal to `3`, enqueue a Fenced code fence sequence end label, and reconsume in the Fenced code grave accent open fence after state
Otherwise, treat it as per the “anything else” entry below
↪ U+0060 GRAVE ACCENT (`` ` ``)
Increment `sizeOpen` by `1` and consume
↪ Anything else
Unset `sizeOpen` and enqueue a NOK label
↪ EOF
Enqueue a Fenced code fence end label and enqueue an End-of-file token
↪ EOL
Enqueue a Fenced code fence end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+0060 GRAVE ACCENT (`` ` ``)
Unset `sizeOpen` and enqueue a NOK label
↪ Anything else
Ensure a Content token and consume
↪ U+0060 GRAVE ACCENT (`` ` ``)
Let `sizeClose` be `1`, enqueue a Fenced code fence sequence start label, enqueue a Sequence token, consume, and switch to the Fenced code grave accent close fence inside state
↪ Anything else
Reconsume in the Fenced code grave accent continuation inside state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
If `sizeClose` is greater than or equal to `sizeOpen`, enqueue a Fenced code fence sequence end label and reconsume in the Fenced code grave accent close fence after state
Otherwise, treat it as per the “anything else” entry below
↪ U+0060 GRAVE ACCENT (`` ` ``)
Increment `sizeClose` by `1` and consume
↪ Anything else
Unset `sizeClose` and reconsume in the Fenced code grave accent continuation inside state
↪ EOF
Unset `sizeOpen`, unset `sizeClose`, enqueue a Fenced code fence end label, enqueue a Fenced code end label, and enqueue an End-of-file token
↪ EOL
Unset `sizeOpen`, unset `sizeClose`, enqueue a Fenced code fence end label, enqueue a Fenced code end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Fenced code grave accent continuation inside state
↪ EOF
Unset `sizeOpen`, unset `sizeClose`, enqueue a Fenced code fence end label, enqueue a Fenced code end label, and enqueue an End-of-file token
↪ EOL
Enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ Anything else
Ensure a Content token and consume
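The closing fence states come down to comparing `sizeClose` with `sizeOpen` and allowing only trailing whitespace. The sketch below illustrates that check for one line inside fenced code, assuming the line has no leading whitespace and no line ending; the function name and parameters are hypothetical.

```python
def closes_fence(line, size_open, marker="`"):
    # Count the run of markers at the start of the line (sizeClose).
    size_close = len(line) - len(line.lstrip(marker))
    if size_close == 0 or size_close < size_open:
        return False
    # After the sequence, only whitespace may follow; anything else makes
    # the whole line content of the code block instead.
    return line[size_close:].strip(" \t") == ""
```

A closing sequence may be longer than the opening one but not shorter, so a code block opened with four markers can contain a line of three markers as content.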
↪ U+007E TILDE (~
)
Let sizeOpen
be 1
, enqueue a Fenced code start label, enqueue a
Fenced code fence start label, enqueue a Fenced code fence sequence start label,
enqueue a Sequence token, consume, and switch to the
Fenced code tilde open fence inside state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
If sizeOpen
is greater than or equal to 3
, enqueue a
Fenced code fence sequence end label, and reconsume in the
Fenced code tilde open fence after state
Otherwise, treat it as per the “anything else” entry below
↪ U+007E TILDE (~
)
Increment sizeOpen
by 1
and consume
↪ Anything else
Unset sizeOpen
and enqueue a NOK label
↪ EOF
Enqueue a Fenced code fence end label and enqueue an End-of-file token
↪ EOL
Enqueue a Fenced code fence end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Ensure a Content token and consume
↪ U+007E TILDE (~
)
Let sizeClose
be 1
, enqueue a Fenced code fence sequence start label,
enqueue a Sequence token, consume, and switch to the
Fenced code tilde close fence inside state
↪ Anything else
Reconsume in the Fenced code tilde continuation inside state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
If sizeClose
is greater than or equal to sizeOpen
, enqueue a
Fenced code fence sequence end label, and reconsume in the
Fenced code tilde close fence after state
Otherwise, treat it as per the “anything else” entry below
↪ U+007E TILDE (~
)
Increment sizeClose
by 1
and consume
↪ Anything else
Unset sizeClose
and reconsume in the
Fenced code tilde continuation inside state
↪ EOF
Unset sizeOpen, unset sizeClose, enqueue a Fenced code fence end label, enqueue a Fenced code end label, and enqueue an End-of-file token
↪ EOL
Unset sizeOpen, unset sizeClose, enqueue a Fenced code fence end label, enqueue a Fenced code end label, enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Fenced code tilde continuation inside state
↪ EOF
Unset sizeOpen, unset sizeClose, enqueue a Fenced code fence end label, enqueue a Fenced code end label, and enqueue an End-of-file token
↪ EOL
Enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ Anything else
Ensure a Content token and consume
↪ EOF
Enqueue an End-of-file token
↪ EOL
Enqueue an End-of-line token, consume, and switch to the Flow prefix initial state
↪ Anything else
Consume
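The size bookkeeping in the tilde fence states can be illustrated with a minimal sketch. It assumes the flow prefix has already been stripped from each line, and the helper names `open_fence_size` and `closes_fence` are hypothetical. It follows the states as written here: an opening run needs at least 3 tildes followed by whitespace or the end of the line, and a closing run must be at least as long as the opening run and followed only by whitespace.

```python
WHITESPACE = " \t"


def open_fence_size(line: str) -> int:
    """Return sizeOpen for a line that opens a tilde fence, or 0 for NOK."""
    size = len(line) - len(line.lstrip("~"))
    after = line[size:size + 1]  # the empty string stands in for EOL or EOF
    if size >= 3 and (after == "" or after in WHITESPACE):
        return size
    return 0


def closes_fence(line: str, size_open: int) -> bool:
    """Check whether a line closes a fence that was opened with size_open tildes."""
    size_close = len(line) - len(line.lstrip("~"))
    # Only whitespace may follow the closing sequence.
    return size_close >= size_open and line[size_close:].strip(WHITESPACE) == ""
```

For example, a fence opened with four tildes is not closed by a line of three tildes, which is instead treated as a continuation line.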
The content state machine is used to tokenize the inline constructs part of content blocks in a document (such as regular definitions and phrasing) and must start in the Content start state.
Hookable, the regular hooks are:
- U+005B LEFT SQUARE BRACKET ([): Definition label start state
↪ Anything else
Reconsume in the Content initial state
Hookable, there are no regular hooks.
↪ Anything else
Reconsume in the Phrasing content state
↪ U+005B LEFT SQUARE BRACKET ([)
Enqueue a Content definition start label, enqueue a Content definition label start label, enqueue a Marker token, consume, enqueue a Content definition label open label, and switch to the Definition label before state
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+005D RIGHT SQUARE BRACKET (])
Enqueue a NOK label
↪ Anything else
Enqueue a Content token, consume, and switch to the Definition label inside state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Definition label between state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Definition label escape state
↪ U+005D RIGHT SQUARE BRACKET (])
Enqueue a Content definition label close label, enqueue a Marker token, consume, enqueue a Content definition label end label, and switch to the Definition label after state
↪ Anything else
Ensure a Content token and consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Definition label inside state
↪ U+005C BACKSLASH (\)\
↪ U+005D RIGHT SQUARE BRACKET (])
Consume and switch to the Definition label inside state
↪ Anything else
Reconsume in the Definition label inside state
↪ U+003A COLON (:)
Enqueue a Marker token, consume, and switch to the Definition destination before state
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+003C LESS THAN (<)
Enqueue a Content definition destination start label, enqueue a Marker token, enqueue a Content definition destination quoted open label, consume, and switch to the Definition destination quoted inside state
↪ ASCII control
Enqueue a NOK label
↪ Anything else
Let balance be 0, enqueue a Content definition destination start label, enqueue a Content definition destination unquoted open label, enqueue a Content token, and reconsume in the Definition destination unquoted inside state
↪ EOF\
↪ EOL\
↪ U+003C LESS THAN (<)
Enqueue a NOK label
↪ U+003E GREATER THAN (>)
Enqueue a Content definition destination quoted close label, enqueue a Marker token, consume, enqueue a Content definition destination end label, and switch to the Definition destination after state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Definition destination quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ U+003C LESS THAN (<)\
↪ U+003E GREATER THAN (>)\
↪ U+005C BACKSLASH (\)
Consume and switch to the Definition destination quoted inside state
↪ Anything else
Reconsume in the Definition destination quoted inside state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Unset balance, enqueue a Content definition destination unquoted close label, enqueue a Content definition destination end label, and reconsume in the Definition title before state
↪ U+0028 LEFT PARENTHESIS (()
Increment balance by 1, ensure a Content token, and consume
↪ U+0029 RIGHT PARENTHESIS ())
If balance is 0, treat it as per the “ASCII control” entry below
Otherwise, decrement balance by 1, ensure a Content token, and consume
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Definition destination unquoted escape state
↪ ASCII control
Unset balance and enqueue a NOK label
↪ Anything else
Ensure a Content token and consume
↪ U+0028 LEFT PARENTHESIS (()\
↪ U+0029 RIGHT PARENTHESIS ())\
↪ U+005C BACKSLASH (\)
Consume and switch to the Definition destination unquoted inside state
↪ Anything else
Reconsume in the Definition destination unquoted inside state
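How balance tracks nested parentheses in an unquoted destination can be shown with a minimal sketch. The function name `scan_unquoted_destination` is hypothetical, the input is assumed to start at the first character of the destination, and ASCII control is assumed to mean U+0000 to U+001F and U+007F. The sketch returns how many characters belong to the destination, or `None` where the states above enqueue a NOK label.

```python
from typing import Optional


def scan_unquoted_destination(text: str) -> Optional[int]:
    """Return the size of the unquoted destination at the start of text, or None for NOK."""
    balance = 0
    index = 0
    while index < len(text):
        char = text[index]
        if char in " \t\r\n":
            break  # whitespace or EOL ends the destination
        if char == "\\":
            # Escape state: a following "(", ")", or "\" is consumed with the backslash.
            if index + 1 < len(text) and text[index + 1] in "()\\":
                index += 1
            index += 1
            continue
        if char == "(":
            balance += 1
        elif char == ")":
            if balance == 0:
                return None  # unbalanced closing parenthesis
            balance -= 1
        elif ord(char) < 0x20 or ord(char) == 0x7F:
            return None  # ASCII control (assumed range)
        index += 1
    return index
```

An escaped parenthesis does not touch balance, which is why `a\)b` is accepted while `a)b` is not.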
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Definition title before state
↪ Anything else
Enqueue a NOK label
↪ EOL
Enqueue a Content definition partial label, enqueue an End-of-line token, consume, and switch to the Definition title or label before state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+005B LEFT SQUARE BRACKET ([)
Enqueue a NOK label
↪ Anything else
Reconsume in the Definition title or label before state
↪ EOF
Enqueue a Content definition end label and enqueue an End-of-file token
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+0022 QUOTATION MARK (")
Enqueue a Content definition title start label, enqueue a Marker token, consume, enqueue a Content definition title open label, and switch to the Definition title double quoted state
↪ U+0027 APOSTROPHE (')
Enqueue a Content definition title start label, enqueue a Marker token, consume, enqueue a Content definition title open label, and switch to the Definition title single quoted state
↪ U+0028 LEFT PARENTHESIS (()
Enqueue a Content definition title start label, enqueue a Marker token, consume, enqueue a Content definition title open label, and switch to the Definition title paren quoted state
↪ Anything else
Reconsume in the Content start state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Definition title double quoted between state
↪ U+0022 QUOTATION MARK (")
Enqueue a Content definition title close label, enqueue a Marker token, consume, enqueue a Content definition title end label, and switch to the Definition title after state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Definition title double quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Definition title double quoted state
↪ U+0022 QUOTATION MARK (")\
↪ U+005C BACKSLASH (\)
Consume and switch to the Definition title double quoted state
↪ Anything else
Reconsume in the Definition title double quoted state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Definition title single quoted between state
↪ U+0027 APOSTROPHE (')
Enqueue a Content definition title close label, enqueue a Marker token, consume, enqueue a Content definition title end label, and switch to the Definition title after state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Definition title single quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Definition title single quoted state
↪ U+0027 APOSTROPHE (')\
↪ U+005C BACKSLASH (\)
Consume and switch to the Definition title single quoted state
↪ Anything else
Reconsume in the Definition title single quoted state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Definition title paren quoted between state
↪ U+0029 RIGHT PARENTHESIS ())
Enqueue a Content definition title close label, enqueue a Marker token, consume, enqueue a Content definition title end label, and switch to the Definition title after state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Definition title paren quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue an End-of-line token and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Definition title paren quoted state
↪ U+0029 RIGHT PARENTHESIS ())\
↪ U+005C BACKSLASH (\)
Consume and switch to the Definition title paren quoted state
↪ Anything else
Reconsume in the Definition title paren quoted state
↪ EOF
Enqueue a Content definition end label and enqueue an End-of-file token
↪ EOL
Enqueue a Content definition end label, enqueue an End-of-line token, consume, and switch to the Content start state
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue an End-of-file token
↪ EOL
Enqueue an End-of-line token, consume, and switch to the Content initial state
↪ Anything else
Consume
The text state machine is used to tokenize the inline constructs part of rich text (such as regular resources and emphasis) or plain text (such as regular character escapes or character references) in a document and must start in the Text start state.
If text is parsed as plain text, the Text start state, Text initial state, and Text state all forward to the Plain text state.
If text is parsed as rich text, an additional variable prev must be tracked. Initially set to EOF, it must be set to the input character right before a character is consumed.
Hookable, there are no regular hooks.
↪ Anything else
Reconsume in the Text initial state
Hookable, there are no regular hooks.
↪ Anything else
Reconsume in the Text state
Hookable, the regular hooks are:
- EOL: End-of-line state
- U+0021 EXCLAMATION MARK (!): Image label start state
- U+0026 AMPERSAND (&): Character reference state
- U+002A ASTERISK (*): Delimiter run asterisk start state
- U+003C LESS THAN (<): Autolink state
- U+003C LESS THAN (<): HTML state
- U+005B LEFT SQUARE BRACKET ([): Link label start state
- U+005C BACKSLASH (\): Character escape state
- U+005C BACKSLASH (\): Break escape state
- U+005D RIGHT SQUARE BRACKET (]): Label resource close state
- U+005D RIGHT SQUARE BRACKET (]): Label reference close state
- U+005D RIGHT SQUARE BRACKET (]): Label reference shortcut close state
- U+005F UNDERSCORE (_): Delimiter run underscore start state
- U+0060 GRAVE ACCENT (`): Code start state
↪ EOF
Enqueue an End-of-file token
↪ Anything else
Ensure a Content token and consume
Hookable, the regular hooks are:
- EOL: Plain end-of-line state
- U+0026 AMPERSAND (&): Character reference state
- U+005C BACKSLASH (\): Character escape state
↪ EOF
Enqueue an End-of-file token
↪ Anything else
Ensure a Content token and consume
↪ EOL
If the break represented by the character starts with two or more of VIRTUAL SPACE, U+0009 CHARACTER TABULATION (HT), or U+0020 SPACE (SP), enqueue a Hard break label, enqueue an End-of-line token, consume, and switch to the Text initial state
Otherwise, enqueue a Soft break label, enqueue an End-of-line token, consume, and switch to the Text initial state
↪ Anything else
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and switch to the Text initial state
↪ Anything else
Enqueue a NOK label
↪ U+0021 EXCLAMATION MARK (!)
Enqueue an Image label start label, enqueue a Marker token, consume, and switch to the Image label start after state
↪ Anything else
Enqueue a NOK label
↪ U+005B LEFT SQUARE BRACKET ([)
Enqueue a Marker token, consume, enqueue an Image label open label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ U+0026 AMPERSAND (&)
Enqueue a Character reference start label, enqueue a Marker token, consume, and switch to the Character reference start after state
↪ Anything else
Enqueue a NOK label
↪ U+0023 NUMBER SIGN (#)
Enqueue a Marker token, consume, and switch to the Character reference numeric state
↪ ASCII alphanumeric
Let entityName be the empty string, append the character to entityName, enqueue a Content token, consume, and switch to the Character reference named state
↪ Anything else
Enqueue a NOK label
Note: This state can be optimized by either imposing a maximum size (the size of the longest possible named character reference) or by using a trie of the possible named character references.
↪ U+003B SEMICOLON (;)
If entityName is a character reference name, unset entityName, enqueue a Marker token, consume, enqueue a Character reference end label, and switch to the Text state
Otherwise, treat it as per the “anything else” entry below
↪ ASCII alphanumeric
Append the character to entityName and consume
↪ Anything else
Unset entityName and enqueue a NOK label
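The maximum-size optimization mentioned in the note can be illustrated with a minimal sketch. The set `CHARACTER_REFERENCE_NAMES` is a hypothetical five-name subset used only for illustration (a real parser would use the full list of character reference names), and the function name `scan_named_reference` is also an assumption. The input is assumed to start right after the ampersand.

```python
from typing import Optional

# Hypothetical subset for illustration; not the full list of names.
CHARACTER_REFERENCE_NAMES = {"amp", "lt", "gt", "quot", "copy"}
MAX_NAME_SIZE = max(len(name) for name in CHARACTER_REFERENCE_NAMES)


def scan_named_reference(text: str) -> Optional[str]:
    """Return the matched name, or None where the state enqueues a NOK label."""
    entity_name = ""
    for char in text:
        if char == ";":
            return entity_name if entity_name in CHARACTER_REFERENCE_NAMES else None
        if not (char.isascii() and char.isalnum()):
            return None
        if len(entity_name) == MAX_NAME_SIZE:
            return None  # longer than any known name, so give up early
        entity_name += char
    return None  # EOF before the semicolon
```

With the size limit in place, a long run of alphanumerics after an ampersand is rejected after at most `MAX_NAME_SIZE` characters instead of being buffered in full.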
↪ U+0058 (X)\
↪ U+0078 (x)
Let characterReferenceCode be 0, enqueue a Marker token, consume, and switch to the Character reference hexadecimal start state
↪ ASCII digit
Let characterReferenceCode be 0, enqueue a Content token and reconsume in the Character reference decimal state
↪ Anything else
Enqueue a NOK label
↪ ASCII hex digit
Enqueue a Content token and reconsume in the Character reference hexadecimal state
↪ Anything else
Unset characterReferenceCode and enqueue a NOK label
Note: This state can be optimized by imposing a maximum size (the size of the longest possible valid hexadecimal character reference, 6).
↪ U+003B SEMICOLON (;)
Unset characterReferenceCode, enqueue a Marker token, consume, enqueue a Character reference end label, and switch to the Text state
↪ ASCII digit
Multiply characterReferenceCode by 0x10, add the digitized input character to characterReferenceCode, and consume
↪ ASCII upper hex digit
Multiply characterReferenceCode by 0x10, add the digitized input character to characterReferenceCode, and consume
↪ ASCII lower hex digit
Multiply characterReferenceCode by 0x10, add the digitized input character to characterReferenceCode, and consume
↪ Anything else
Unset characterReferenceCode and enqueue a NOK label
Note: This state can be optimized by imposing a maximum size (the size of the longest possible valid decimal character reference, 7).
↪ U+003B SEMICOLON (;)
Unset characterReferenceCode, enqueue a Marker token, consume, enqueue a Character reference end label, and switch to the Text state
↪ ASCII digit
Multiply characterReferenceCode by 10, add the digitized input character to characterReferenceCode, and consume
↪ Anything else
Unset characterReferenceCode and enqueue a NOK label
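The accumulation of characterReferenceCode, together with the size limits from the two notes (6 hexadecimal digits, 7 decimal digits), can be illustrated with a minimal sketch. The function name `scan_numeric_reference` is hypothetical and the input is assumed to start right after `&#`.

```python
import string
from typing import Optional


def scan_numeric_reference(text: str) -> Optional[int]:
    """Return the accumulated code, or None where the states enqueue a NOK label."""
    if text[:1] in ("x", "X"):
        base, max_size, digits, allowed = 16, 6, text[1:], string.hexdigits
    else:
        base, max_size, digits, allowed = 10, 7, text, string.digits
    code = 0
    size = 0
    for char in digits:
        if char == ";":
            # At least one digit must precede the semicolon.
            return code if size > 0 else None
        if char not in allowed or size == max_size:
            return None
        # Multiply by the base and add the digitized character.
        code = code * base + int(char, 16)
        size += 1
    return None  # EOF before the semicolon
```

For instance, `scan_numeric_reference("x1F600;")` yields `0x1F600`, while an eight-digit decimal reference is rejected once the seventh digit has been consumed.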
↪ U+002A ASTERISK (*)
Let delimiterRunAfter be null and let delimiterRunBefore be 'whitespace' if prev is EOF, EOL, or Unicode whitespace, 'punctuation' if it is Unicode punctuation, or null otherwise
Enqueue a Delimiter run start label, enqueue a Sequence token, consume, and switch to the Delimiter run asterisk state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ Unicode whitespace
Let delimiterRunAfter be 'whitespace' and treat it as per the “anything else” entry below
↪ U+002A ASTERISK (*)
Consume
↪ Unicode punctuation
Let delimiterRunAfter be 'punctuation' and treat it as per the “anything else” entry below
↪ Anything else
Let leftFlanking be whether both delimiterRunAfter is not 'whitespace', and that either delimiterRunAfter is not 'punctuation' or that delimiterRunBefore is not null
Let rightFlanking be whether both delimiterRunBefore is not 'whitespace', and that either delimiterRunBefore is not 'punctuation' or that delimiterRunAfter is not null
Unset delimiterRunBefore, unset delimiterRunAfter, enqueue a Delimiter run end label, and reconsume in the Text state
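The two flanking conditions can be written out as a minimal sketch. The helper names `classify` and `flanking` are hypothetical. As an approximation that is not taken from this document, Unicode whitespace is detected with `str.isspace` and Unicode punctuation as ASCII punctuation or any character in a Unicode `P*` general category. `None` stands in for EOF.

```python
import string
import unicodedata
from typing import Optional, Tuple


def classify(char: Optional[str]) -> Optional[str]:
    """Map a character next to a delimiter run to 'whitespace', 'punctuation', or None."""
    if char is None or char.isspace():
        return "whitespace"  # EOF, EOL, or (approximated) Unicode whitespace
    if char in string.punctuation or unicodedata.category(char).startswith("P"):
        return "punctuation"  # approximated Unicode punctuation
    return None


def flanking(before: Optional[str], after: Optional[str]) -> Tuple[bool, bool]:
    """Return (leftFlanking, rightFlanking) for the characters around a delimiter run."""
    run_before = classify(before)
    run_after = classify(after)
    left = run_after != "whitespace" and (
        run_after != "punctuation" or run_before is not None
    )
    right = run_before != "whitespace" and (
        run_before != "punctuation" or run_after is not None
    )
    return left, right
```

A run between two word characters, as in `a*b`, is both left and right flanking, whereas a run at the start of the text followed by a word character is only left flanking.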
↪ U+003C LESS THAN (<)
Enqueue an Autolink start label, enqueue a Marker token, consume, enqueue an Autolink open label, and switch to the Autolink open state
↪ Anything else
Enqueue a NOK label
↪ ASCII alpha
Consume, let sizeScheme be 1, and switch to the Autolink scheme or email atext state
↪ atext\
↪ U+002E DOT (.)
Consume and switch to the Autolink email atext state
↪ Anything else
Enqueue a NOK label
↪ U+0040 AT SIGN (@)
Enqueue a Marker token, consume, let sizeLabel be 1, enqueue a Content token, and switch to the Autolink email at sign or dot state
↪ atext\
↪ U+002E DOT (.)
Consume
↪ Anything else
Enqueue a NOK label
↪ U+002D DASH (-)
If sizeLabel is not 63, increment sizeLabel by 1, consume, and switch to the Autolink email dash state
Otherwise, treat it as per the “anything else” entry below
↪ U+002E DOT (.)
If sizeLabel is not 63, increment sizeLabel by 1, consume, and switch to the Autolink email at sign or dot state
Otherwise, treat it as per the “anything else” entry below
↪ U+003E GREATER THAN (>)
Unset sizeLabel, enqueue an Autolink email close label, enqueue a Marker token, consume, enqueue an Autolink email end label, and switch to the Text state
↪ ASCII alphanumeric
If sizeLabel is not 63, increment sizeLabel by 1 and consume
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Unset sizeLabel and enqueue a NOK label
↪ ASCII alphanumeric
If sizeLabel is not 63, increment sizeLabel by 1, consume, and switch to the Autolink email label state
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Unset sizeLabel and enqueue a NOK label
↪ U+002D DASH (-)
If sizeLabel is not 63, increment sizeLabel by 1 and consume
Otherwise, treat it as per the “anything else” entry below
↪ ASCII alphanumeric
If sizeLabel is not 63, increment sizeLabel by 1, consume, and switch to the Autolink email label state
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Unset sizeLabel and enqueue a NOK label
↪ U+002B PLUS SIGN (+)\
↪ U+002E DOT (.)\
↪ U+002D DASH (-)\
↪ ASCII alphanumeric
Increment sizeScheme by 1, consume, and switch to the Autolink scheme inside or email atext state
↪ atext
Unset sizeScheme, consume, and switch to the Autolink email atext state
↪ Anything else
Unset sizeScheme and enqueue a NOK label
↪ U+003A COLON (:)
Unset sizeScheme, enqueue a Marker token, consume, enqueue a Content token, and switch to the Autolink URI inside state
↪ U+0040 AT SIGN (@)
Unset sizeScheme, enqueue a Marker token, consume, let sizeLabel be 1, enqueue a Content token, and switch to the Autolink email at sign or dot state
↪ U+002B PLUS SIGN (+)\
↪ U+002E DOT (.)\
↪ U+002D DASH (-)\
↪ ASCII alphanumeric
If sizeScheme is not 32, increment sizeScheme by 1 and consume
Otherwise, treat it as per the “atext” entry below
↪ atext
Unset sizeScheme, consume, and switch to the Autolink email atext state
↪ Anything else
Unset sizeScheme and enqueue a NOK label
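Read together, the scheme states accept a scheme of 2 to 32 characters before the colon: an ASCII letter followed by ASCII alphanumerics, plus signs, dots, or dashes. A minimal sketch of that constraint follows. The function name `is_autolink_scheme` is hypothetical, and the fallback to the email states (taken when a scheme candidate turns out to be atext instead) is deliberately left out.

```python
import string

# Characters allowed after the first character of a scheme.
SCHEME_CONTINUE = string.ascii_letters + string.digits + "+.-"


def is_autolink_scheme(scheme: str) -> bool:
    """Check the text before the colon against the sizeScheme limits."""
    return (
        2 <= len(scheme) <= 32
        and scheme[0] in string.ascii_letters
        and all(char in SCHEME_CONTINUE for char in scheme[1:])
    )
```

The lower bound comes from the colon only being matched in the Autolink scheme inside or email atext state, which is entered once sizeScheme has reached 2.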
↪ EOF\
↪ EOL\
↪ ASCII control\
↪ U+0020 SPACE (SP)\
↪ U+003C LESS THAN (<)
Enqueue a NOK label
↪ U+003E GREATER THAN (>)
Enqueue an Autolink uri close label, enqueue a Marker token, consume, enqueue an Autolink uri end label, and switch to the Text state
↪ Anything else
Consume
↪ U+003C LESS THAN (<)
Enqueue an HTML start label, enqueue a Content token, consume, and switch to the HTML open state
↪ Anything else
Enqueue a NOK label
↪ U+0021 EXCLAMATION MARK (!)
Consume and switch to the HTML declaration start state
↪ U+002F SLASH (/)
Consume and switch to the HTML tag close start state
↪ U+003F QUESTION MARK (?)
Consume and switch to the HTML instruction inside state
↪ ASCII alpha
Consume and switch to the HTML tag open inside state
↪ Anything else
Enqueue a NOK label
↪ U+002D DASH (-)
Consume and switch to the HTML comment open inside state
↪ [CDATA[ (the five upper letters “CDATA” with a U+005B LEFT SQUARE BRACKET ([) before and after)
Consume and switch to the HTML CDATA inside state
Reconsume in the HTML declaration inside state
↪ Anything else
Enqueue a NOK label
↪ U+002D DASH (-)
Consume and switch to the HTML comment inside state
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ U+002D DASH (-)
Consume and switch to the HTML comment close inside state
↪ Anything else
Consume
↪ U+002D DASH (-)
Consume and switch to the HTML comment close state
↪ Anything else
Reconsume in the HTML comment inside state
Note: a CM comment may not contain two dashes (--), and may not end in a dash (which would result in --->). Here we have seen two dashes, so we can either be at the end of a comment, or no longer in a comment.
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
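The rule from the note, that two dashes inside a comment must be followed directly by U+003E GREATER THAN, can be shown with a minimal sketch. The function name `scan_html_comment` is hypothetical, the input is assumed to start at the opening `<!--`, and line endings are treated like any other character even though the states above also enqueue break labels for them.

```python
from typing import Optional


def scan_html_comment(text: str) -> Optional[int]:
    """Return the size of the comment at the start of text, or None for NOK."""
    if not text.startswith("<!--"):
        return None
    index = 4
    while index < len(text):
        if text.startswith("--", index):
            # Two dashes were seen: only ">" may follow.
            if text.startswith(">", index + 2):
                return index + 3
            return None
        index += 1
    return None  # EOF inside the comment
```

This is why `<!-- a --->` is rejected: the first two dashes are followed by a third dash instead of the closing character.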
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ ]]> (two of U+005D RIGHT SQUARE BRACKET (]), with a U+003E GREATER THAN (>) after)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ U+003F QUESTION MARK (?)
Consume and switch to the HTML instruction close state
↪ Anything else
Consume
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Reconsume in the HTML instruction inside state
↪ ASCII alpha
Consume and switch to the HTML tag close inside state
↪ Anything else
Enqueue a NOK label
↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the HTML tag close between state
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ ASCII alphanumeric\
↪ U+002D DASH (-)
Consume
↪ Anything else
Enqueue a NOK label
Note: an EOL is technically allowed here, but as a > after an EOL would start a blockquote, practically it’s not possible.
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the HTML tag open between state
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ ASCII alphanumeric\
↪ U+002D DASH (-)
Consume
↪ Anything else
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+002F SLASH (/)
Consume and switch to the HTML tag open self closing state
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ ASCII alpha\
↪ U+003A COLON (:)\
↪ U+005F UNDERSCORE (_)
Consume and switch to the HTML tag open attribute name state
↪ Anything else
Enqueue a NOK label
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ ASCII alphanumeric\
↪ U+002D DASH (-)\
↪ U+002E DOT (.)\
↪ U+003A COLON (:)\
↪ U+005F UNDERSCORE (_)
Consume
↪ Anything else
Reconsume in the HTML tag open attribute name after state
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+002F SLASH (/)
Consume and switch to the HTML tag open self closing state
↪ U+003D EQUALS TO (=)
Consume and switch to the HTML tag open attribute before state
↪ U+003E GREATER THAN (>)
Consume, enqueue an HTML end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Consume
↪ U+0022 QUOTATION MARK (")
Consume and switch to the HTML tag open double quoted attribute state
↪ U+0027 APOSTROPHE (')
Consume and switch to the HTML tag open single quoted attribute state
↪ U+003C LESS THAN (<)\
↪ U+003D EQUALS TO (=)\
↪ U+003E GREATER THAN (>)\
↪ U+0060 GRAVE ACCENT (`)
Enqueue a NOK label
↪ Anything else
Consume and switch to the HTML tag open unquoted attribute state
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ U+0022 QUOTATION MARK (")
Consume and switch to the HTML tag open between state
↪ Anything else
Consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, consume, and enqueue a Content token
↪ U+0027 APOSTROPHE (')
Consume and switch to the HTML tag open between state
↪ Anything else
Consume
↪ EOF\
↪ U+0022 QUOTATION MARK (")\
↪ U+0027 APOSTROPHE (')\
↪ U+003C LESS THAN (<)\
↪ U+003D EQUALS TO (=)\
↪ U+0060 GRAVE ACCENT (`)
Enqueue a NOK label
↪ EOL\
↪ VIRTUAL SPACE\
↪ U+0009 CHARACTER TABULATION (HT)\
↪ U+0020 SPACE (SP)\
↪ U+003E GREATER THAN (>)
Reconsume in the HTML tag open between state
↪ Anything else
Consume
↪ U+005B LEFT SQUARE BRACKET ([)
Enqueue a Link label start label, enqueue a Marker token, consume, enqueue a Link label open label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ U+005C BACKSLASH (\)
Enqueue a Character escape start label, enqueue a Marker token, consume, and switch to the Character escape after state
↪ Anything else
Enqueue a NOK label
↪ ASCII punctuation
Enqueue a Content token, consume, enqueue a Character escape end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ U+005C BACKSLASH (\)
Enqueue a Break escape start label, enqueue a Hard break label, enqueue a Marker token, consume, and switch to the Break escape after state
↪ Anything else
Enqueue a NOK label
↪ EOL
If the break represented by the character does not start with a VIRTUAL SPACE, U+0009 CHARACTER TABULATION (HT), or U+0020 SPACE (SP), enqueue an End-of-line token, consume, enqueue a Break escape end label, and switch to the Text initial state
Otherwise, treat it as per the “anything else” entry below
↪ Anything else
Enqueue a NOK label
↪ U+005D RIGHT SQUARE BRACKET (])
Enqueue a Label close label, enqueue a Marker token, consume, enqueue a Label end label, and switch to the Label resource end after state
↪ Anything else
Enqueue a NOK label
↪ U+0028 LEFT PARENTHESIS (()
Enqueue a Resource information start label, enqueue a Marker token, consume, enqueue a Resource information open label, and switch to the Resource information open state
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+0029 RIGHT PARENTHESIS ())
Enqueue a Resource information close label, enqueue a Marker token, consume, enqueue a Resource information end label, and switch to the Text state
↪ U+003C LESS THAN (<)
Enqueue a Resource information destination quoted start label, enqueue a Marker token, consume, enqueue a Resource information destination quoted open label, and switch to the Resource information destination quoted inside state
↪ ASCII control
Enqueue a NOK label
↪ Anything else
Enqueue a Resource information destination unquoted start label, enqueue a Content token, consume, enqueue a Resource information destination unquoted open label, and switch to the Resource information destination unquoted inside state
↪ EOF\
↪ EOL\
↪ U+003C LESS THAN (<)
Enqueue a NOK label
↪ U+003E GREATER THAN (>)
Enqueue a Resource information destination quoted close label, enqueue a Marker token, consume, enqueue a Resource information destination quoted end label, and switch to the Resource information destination quoted after state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Resource information destination quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ U+003C LESS THAN (<)\
↪ U+003E GREATER THAN (>)\
↪ U+005C BACKSLASH (\)
Consume and switch to the Resource information destination quoted inside state
↪ Anything else
Reconsume in the Resource information destination quoted inside state
↪ EOL\
↪ VIRTUAL SPACE\
↪ U+0009 CHARACTER TABULATION (HT)\
↪ U+0020 SPACE (SP)\
↪ U+003E GREATER THAN (>)
Reconsume in the Resource information between state
↪ U+0029 RIGHT PARENTHESIS ())
Enqueue a Resource information close label, enqueue a Marker token, consume, enqueue a Resource information end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ EOF
Unset balance and enqueue a NOK label
↪ EOL\
↪ VIRTUAL SPACE\
↪ U+0009 CHARACTER TABULATION (HT)\
↪ U+0020 SPACE (SP)\
↪ U+003E GREATER THAN (>)
Enqueue a Resource information destination unquoted close label, enqueue a Resource information destination unquoted end label, and reconsume in the Resource information between state
↪ U+0028 LEFT PARENTHESIS (()
Increment balance by 1 and consume
↪ U+0029 RIGHT PARENTHESIS ())
If balance is 0, unset balance, enqueue a Resource information destination unquoted close label, enqueue a Resource information destination unquoted end label, and reconsume in the Resource information between state
Otherwise, decrement balance by 1, and consume
↪ U+005C BACKSLASH (\)
Consume and switch to the Resource information destination unquoted escape state
↪ ASCII control
Unset balance and enqueue a NOK label
↪ Anything else
Consume
↪ U+0028 LEFT PARENTHESIS (()\
↪ U+0029 RIGHT PARENTHESIS ())\
↪ U+005C BACKSLASH (\)
Consume and switch to the Resource information destination unquoted inside state
↪ Anything else
Reconsume in the Resource information destination unquoted inside state
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+0022 QUOTATION MARK (")
Enqueue a Resource information title start label, enqueue a Marker token, consume, enqueue a Resource information title open label, and switch to the Resource information title double quoted inside state
↪ U+0027 APOSTROPHE (')
Enqueue a Resource information title start label, enqueue a Marker token, consume, enqueue a Resource information title open label, and switch to the Resource information title single quoted inside state
↪ U+0028 LEFT PARENTHESIS (()
Enqueue a Resource information title start label, enqueue a Marker token, consume, enqueue a Resource information title open label, and switch to the Resource information title paren quoted inside state
↪ U+0029 RIGHT PARENTHESIS ())
Enqueue a Resource information close label, enqueue a Marker token, consume, enqueue a Resource information end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ U+0022 QUOTATION MARK (")
Enqueue a Resource information title close label, enqueue a Marker token, consume, enqueue a Resource information title end label, and switch to the Resource information title after state
↪ U+005C BACKSLASH (\)
Ensure a Content token, consume, and switch to the Resource information title double quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ U+0022 QUOTATION MARK (`"`)\ ↪ U+005C BACKSLASH (`\`)
Consume and switch to the Resource information title double quoted inside state
↪ Anything else
Reconsume in the Resource information title double quoted inside state
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ U+0027 APOSTROPHE (`'`)
Enqueue a Resource information title close label, enqueue a Marker token, consume, enqueue a Resource information title end label, and switch to the Resource information title after state
↪ U+005C BACKSLASH (`\`)
Ensure a Content token, consume, and switch to the Resource information title single quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ U+0027 APOSTROPHE (`'`)\ ↪ U+005C BACKSLASH (`\`)
Consume and switch to the Resource information title single quoted inside state
↪ Anything else
Reconsume in the Resource information title single quoted inside state
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ U+0029 RIGHT PARENTHESIS (`)`)
Enqueue a Resource information title close label, enqueue a Marker token, consume, enqueue a Resource information title end label, and switch to the Resource information title after state
↪ U+005C BACKSLASH (`\`)
Ensure a Content token, consume, and switch to the Resource information title paren quoted escape state
↪ Anything else
Ensure a Content token and consume
↪ U+0029 RIGHT PARENTHESIS (`)`)\ ↪ U+005C BACKSLASH (`\`)
Consume and switch to the Resource information title paren quoted inside state
↪ Anything else
Reconsume in the Resource information title paren quoted inside state
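The three quoted title variants above differ only in their closing marker, so they can be illustrated with one loop. The sketch below is an assumption-laden illustration: `scan_title` and `CLOSERS` are hypothetical names, line endings are treated as ordinary content rather than producing Soft break labels, and no tokens or labels are enqueued.

```python
# Hypothetical mapping from an opening title marker to its closing marker.
CLOSERS = {'"': '"', "'": "'", "(": ")"}


def scan_title(text, start=0):
    """Return the index of the closing marker of a title, or None.

    `text[start]` is expected to be one of `"`, `'`, or `(`.
    """
    closer = CLOSERS.get(text[start:start + 1])
    if closer is None:
        return None
    index = start + 1
    while index < len(text):
        char = text[index]
        if char == closer:
            return index
        # Escape state: a backslash protects the closing marker or another
        # backslash; otherwise the next character is reconsumed normally.
        if char == "\\" and text[index + 1:index + 2] in (closer, "\\"):
            index += 2
        else:
            index += 1
    # End of input inside a title corresponds to the NOK label.
    return None
```

A title such as `"a\"b"` therefore ends at its last quotation mark, not at the escaped one.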
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+0029 RIGHT PARENTHESIS (`)`)
Enqueue a Resource information close label, enqueue a Marker token, consume, enqueue a Resource information end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ U+005D RIGHT SQUARE BRACKET (`]`)
Enqueue a Label close label, enqueue a Marker token, consume, enqueue a Label end label, and switch to the Label reference end after state
↪ Anything else
Enqueue a NOK label
↪ U+005B LEFT SQUARE BRACKET (`[`)
Enqueue a Reference label start label, enqueue a Marker token, consume, enqueue a Reference label open label, and switch to the Reference label open state
↪ Anything else
Enqueue a NOK label
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ U+005D RIGHT SQUARE BRACKET (`]`)
Enqueue a Reference label collapsed close label, enqueue a Marker token, consume, enqueue a Reference label end label, and switch to the Text state
↪ Anything else
Enqueue a Content token, consume, and switch to the Reference label inside state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Reference label between state
↪ U+005C BACKSLASH (`\`)
Ensure a Content token, consume, and switch to the Reference label escape state
↪ U+005D RIGHT SQUARE BRACKET (`]`)
Enqueue a Reference label full close label, enqueue a Marker token, consume, enqueue a Reference label end label, and switch to the Text state
↪ Anything else
Ensure a Content token and consume
↪ EOF
Enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Reference label inside state
↪ U+005C BACKSLASH (`\`)\ ↪ U+005D RIGHT SQUARE BRACKET (`]`)
Consume and switch to the Reference label inside state
↪ Anything else
Reconsume in the Reference label inside state
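The reference label states above distinguish a collapsed label (only whitespace between the brackets) from a full label (some content before the closing bracket). A minimal sketch of that distinction follows; `scan_reference_label` is a hypothetical name, VIRTUAL SPACE is not modelled, and the return value is a plain tuple rather than the labels and tokens the states enqueue.

```python
def scan_reference_label(text, start=0):
    """Return (kind, index of the closing `]`), or None.

    `kind` is "collapsed" when no content was seen and "full" otherwise.
    """
    if text[start:start + 1] != "[":
        return None
    index = start + 1
    seen_content = False
    while index < len(text):
        char = text[index]
        if char == "]":
            return ("full" if seen_content else "collapsed", index)
        if char == "\\" and text[index + 1:index + 2] in ("\\", "]"):
            # Escape state: the escaped character is consumed as content.
            seen_content = True
            index += 2
        elif char in " \t\n":
            # Whitespace and line endings are skipped (between state).
            index += 1
        else:
            seen_content = True
            index += 1
    # End of input before `]` corresponds to the NOK label.
    return None
```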
↪ U+005D RIGHT SQUARE BRACKET (`]`)
Enqueue a Label close label, enqueue a Marker token, consume, enqueue a Label end label, and switch to the Text state
↪ Anything else
Enqueue a NOK label
↪ U+005F UNDERSCORE (`_`)
Let `delimiterRunAfter` be `null` and let `delimiterRunBefore` be `'whitespace'` if `prev` is EOF, EOL, or Unicode whitespace, `'punctuation'` if it is Unicode punctuation, or `null` otherwise
Enqueue a Delimiter run start label, enqueue a Sequence token, consume, and switch to the Delimiter run underscore state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ Unicode whitespace
Let `delimiterRunAfter` be `'whitespace'` and treat it as per the “anything else” entry below
↪ U+005F UNDERSCORE (`_`)
Consume
↪ Unicode punctuation
Let `delimiterRunAfter` be `'punctuation'` and treat it as per the “anything else” entry below
↪ Anything else
Let `leftFlanking` be whether `delimiterRunAfter` is not `'whitespace'` and either `delimiterRunAfter` is not `'punctuation'` or `delimiterRunBefore` is not `null`
Let `rightFlanking` be whether `delimiterRunBefore` is not `'whitespace'` and either `delimiterRunBefore` is not `'punctuation'` or `delimiterRunAfter` is not `null`
Unset `delimiterRunBefore`, unset `delimiterRunAfter`, enqueue a Delimiter run end label, and reconsume in the Text state
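The two flanking conditions above are easier to check when written as boolean expressions. The following sketch assumes a hypothetical `classify` helper that maps the character before or after the run to `'whitespace'`, `'punctuation'`, or `None`, with `None` as input standing in for EOF or EOL. Its approximation of Unicode whitespace and Unicode punctuation (tab, line feed, form feed, carriage return, category `Zs`; ASCII punctuation plus the `P*` categories) follows the CommonMark definitions and is an assumption here, not something this section defines.

```python
import string
import unicodedata


def classify(char):
    """Classify the character next to a delimiter run (None means EOF or EOL)."""
    if char is None or char in "\t\n\f\r" or unicodedata.category(char) == "Zs":
        return "whitespace"
    if char in string.punctuation or unicodedata.category(char).startswith("P"):
        return "punctuation"
    return None


def flanking(before, after):
    """Return (left_flanking, right_flanking) for a delimiter run."""
    run_before = classify(before)
    run_after = classify(after)
    left = run_after != "whitespace" and (
        run_after != "punctuation" or run_before is not None
    )
    right = run_before != "whitespace" and (
        run_before != "punctuation" or run_after is not None
    )
    return left, right
```

A run between a space and a letter is left-flanking only, a run between a letter and the end of the line is right-flanking only, and a run between two letters is both.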
↪ U+0060 GRAVE ACCENT (`` ` ``)
Let `sizeOpen` be `1`, enqueue a Code start label, enqueue a Code fence start label, enqueue a Marker token, consume, and switch to the Code open fence inside state
↪ Anything else
Enqueue a NOK label
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Enqueue a Code fence end label and reconsume in the Code between state
↪ U+0060 GRAVE ACCENT (`` ` ``)
Increment `sizeOpen` by `1` and consume
↪ Anything else
Enqueue a Code fence end label, enqueue a Content token, consume, and switch to the Code inside state
↪ EOF
Unset `sizeOpen` and enqueue a NOK label
↪ EOL
Enqueue a Soft break label, enqueue an End-of-line token, and consume
↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Ensure a Whitespace token and consume
↪ Anything else
Reconsume in the Code inside state
↪ EOF\ ↪ EOL\ ↪ VIRTUAL SPACE\ ↪ U+0009 CHARACTER TABULATION (HT)\ ↪ U+0020 SPACE (SP)
Reconsume in the Code between state
↪ U+0060 GRAVE ACCENT (`` ` ``)
Let `sizeClose` be `1`, enqueue a Code fence start label, enqueue a Marker token, consume, and switch to the Code close fence inside state
↪ Anything else
Ensure a Content token and consume
↪ U+0060 GRAVE ACCENT (`` ` ``)
Increment `sizeClose` by `1` and consume
↪ Anything else
If `sizeOpen` is `sizeClose`, unset `sizeOpen`, unset `sizeClose`, enqueue a Code fence end label, enqueue a Code end label, and reconsume in the Text state
Otherwise, unset `sizeClose` and reconsume in the Code inside state
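The interplay of `sizeOpen` and `sizeClose` in the code states above amounts to finding a closing run of grave accents whose length equals that of the opening run. A minimal sketch, assuming a hypothetical function `match_code_span` that returns the raw text between the fences and ignores the whitespace handling of the Code between state:

```python
def match_code_span(text, start=0):
    """Return the text between matching fences, or None if there is no match."""
    size_open = 0
    index = start
    # Code open fence inside state: count the opening run.
    while index < len(text) and text[index] == "`":
        size_open += 1
        index += 1
    if size_open == 0:
        return None
    content_start = index
    while index < len(text):
        if text[index] != "`":
            # Code inside state: ordinary content.
            index += 1
            continue
        # Code close fence inside state: count a candidate closing run.
        run_start = index
        size_close = 0
        while index < len(text) and text[index] == "`":
            size_close += 1
            index += 1
        if size_close == size_open:
            return text[content_start:run_start]
        # Sizes differ: the run is content, keep scanning.
    # End of input without a matching run corresponds to the NOK label.
    return None
```

With a two-character opening fence, a single grave accent inside the span is kept as content and only a later two-character run closes it.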
A Whitespace token represents inline whitespace that is part of syntax instead of content.
interface Whitespace <: Token {
  size: number
  used: number
  characters: [Character]
}
{
  type: 'whitespace',
  characters: [9],
  size: 3,
  used: 0
}
A Line terminator token represents a line break.
interface LineEnding <: Token {}
{type: 'lineEnding'}
An End-of-file token represents the end of the syntax.
interface EndOfFile <: Token {}
{type: 'endOfFile'}
An End-of-line token represents a point between two runs of text in content.
interface EndOfLine <: Token {}
{type: 'endOfLine'}
A Marker token represents one punctuation character that is part of syntax instead of content.
interface Marker <: Token {}
{type: 'marker'}
A Sequence token represents one or more of the same punctuation characters that are part of syntax instead of content.
interface Sequence <: Token {
  size: number
}
{type: 'sequence', size: 3}
A Content token represents content.
interface Content <: Token {
  prefix: string
}
{type: 'content', prefix: ' '}
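For readers who prefer concrete types, the token interfaces that carry fields can be mirrored as small data classes. This is only one possible representation: the Python class layout and the idea of storing `type` as a defaulted string field are assumptions of this sketch, while the field names and example values come from the interfaces above.

```python
from dataclasses import asdict, dataclass


@dataclass
class Whitespace:
    size: int
    used: int
    characters: list
    type: str = "whitespace"


@dataclass
class Sequence:
    size: int
    type: str = "sequence"


@dataclass
class Content:
    prefix: str
    type: str = "content"


# The Whitespace example above: one tab (code point 9) with a size of 3.
tab = Whitespace(size=3, used=0, characters=[9])
tab_as_dict = asdict(tab)
```

Converting an instance with `asdict` yields a dictionary with the same keys and values as the object examples shown with each interface.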
A raw tag is one of: `script`, `pre`, and `style`.
A basic tag is one of: `address`, `article`, `aside`, `base`, `basefont`, `blockquote`, `body`, `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, `footer`, `form`, `frame`, `frameset`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, `section`, `source`, `summary`, `table`, `tbody`, `td`, `tfoot`, `th`, `thead`, `title`, `tr`, `track`, and `ul`.
A character reference name is one of:
AEli
, AElig
, AM
, AMP
, Aacut
, Aacute
,
Abreve
, Acir
, Acirc
, Acy
, Afr
, Agrav
, Agrave
, Alpha
, Amacr
,
And
, Aogon
, Aopf
, ApplyFunction
, Arin
, Aring
, Ascr
, Assign
,
Atild
, Atilde
, Aum
, Auml
, Backslash
, Barv
, Barwed
, Bcy
,
Because
, Bernoullis
, Beta
, Bfr
, Bopf
, Breve
, Bscr
, Bumpeq
,
CHcy
, COP
, COPY
, Cacute
, Cap
, CapitalDifferentialD
, Cayleys
,
Ccaron
, Ccedi
, Ccedil
, Ccirc
, Cconint
, Cdot
, Cedilla
, CenterDot
,
Cfr
, Chi
, CircleDot
, CircleMinus
, CirclePlus
, CircleTimes
,
ClockwiseContourIntegral
, CloseCurlyDoubleQuote
, CloseCurlyQuote
, Colon
,
Colone
, Congruent
, Conint
, ContourIntegral
, Copf
, Coproduct
,
CounterClockwiseContourIntegral
, Cross
, Cscr
, Cup
, CupCap
, DD
,
DDotrahd
, DJcy
, DScy
, DZcy
, Dagger
, Darr
, Dashv
, Dcaron
, Dcy
,
Del
, Delta
, Dfr
, DiacriticalAcute
, DiacriticalDot
,
DiacriticalDoubleAcute
, DiacriticalGrave
, DiacriticalTilde
, Diamond
,
DifferentialD
, Dopf
, Dot
, DotDot
, DotEqual
, DoubleContourIntegral
,
DoubleDot
, DoubleDownArrow
, DoubleLeftArrow
, DoubleLeftRightArrow
,
DoubleLeftTee
, DoubleLongLeftArrow
, DoubleLongLeftRightArrow
,
DoubleLongRightArrow
, DoubleRightArrow
, DoubleRightTee
, DoubleUpArrow
,
DoubleUpDownArrow
, DoubleVerticalBar
, DownArrow
, DownArrowBar
,
DownArrowUpArrow
, DownBreve
, DownLeftRightVector
, DownLeftTeeVector
,
DownLeftVector
, DownLeftVectorBar
, DownRightTeeVector
, DownRightVector
,
DownRightVectorBar
, DownTee
, DownTeeArrow
, Downarrow
, Dscr
, Dstrok
,
ENG
, ET
, ETH
, Eacut
, Eacute
, Ecaron
, Ecir
, Ecirc
, Ecy
, Edot
,
Efr
, Egrav
, Egrave
, Element
, Emacr
, EmptySmallSquare
,
EmptyVerySmallSquare
, Eogon
, Eopf
, Epsilon
, Equal
, EqualTilde
,
Equilibrium
, Escr
, Esim
, Eta
, Eum
, Euml
, Exists
, ExponentialE
,
Fcy
, Ffr
, FilledSmallSquare
, FilledVerySmallSquare
, Fopf
, ForAll
,
Fouriertrf
, Fscr
, G
, GJcy
, GT
, Gamma
, Gammad
, Gbreve
, Gcedil
,
Gcirc
, Gcy
, Gdot
, Gfr
, Gg
, Gopf
, GreaterEqual
, GreaterEqualLess
,
GreaterFullEqual
, GreaterGreater
, GreaterLess
, GreaterSlantEqual
,
GreaterTilde
, Gscr
, Gt
, HARDcy
, Hacek
, Hat
, Hcirc
, Hfr
,
HilbertSpace
, Hopf
, HorizontalLine
, Hscr
, Hstrok
, HumpDownHump
,
HumpEqual
, IEcy
, IJlig
, IOcy
, Iacut
, Iacute
, Icir
, Icirc
, Icy
,
Idot
, Ifr
, Igrav
, Igrave
, Im
, Imacr
, ImaginaryI
, Implies
, Int
,
Integral
, Intersection
, InvisibleComma
, InvisibleTimes
, Iogon
, Iopf
,
Iota
, Iscr
, Itilde
, Iukcy
, Ium
, Iuml
, Jcirc
, Jcy
, Jfr
, Jopf
,
Jscr
, Jsercy
, Jukcy
, KHcy
, KJcy
, Kappa
, Kcedil
, Kcy
, Kfr
,
Kopf
, Kscr
, L
, LJcy
, LT
, Lacute
, Lambda
, Lang
, Laplacetrf
,
Larr
, Lcaron
, Lcedil
, Lcy
, LeftAngleBracket
, LeftArrow
,
LeftArrowBar
, LeftArrowRightArrow
, LeftCeiling
, LeftDoubleBracket
,
LeftDownTeeVector
, LeftDownVector
, LeftDownVectorBar
, LeftFloor
,
LeftRightArrow
, LeftRightVector
, LeftTee
, LeftTeeArrow
, LeftTeeVector
,
LeftTriangle
, LeftTriangleBar
, LeftTriangleEqual
, LeftUpDownVector
,
LeftUpTeeVector
, LeftUpVector
, LeftUpVectorBar
, LeftVector
,
LeftVectorBar
, Leftarrow
, Leftrightarrow
, LessEqualGreater
,
LessFullEqual
, LessGreater
, LessLess
, LessSlantEqual
, LessTilde
,
Lfr
, Ll
, Lleftarrow
, Lmidot
, LongLeftArrow
, LongLeftRightArrow
,
LongRightArrow
, Longleftarrow
, Longleftrightarrow
, Longrightarrow
,
Lopf
, LowerLeftArrow
, LowerRightArrow
, Lscr
, Lsh
, Lstrok
, Lt
,
Map
, Mcy
, MediumSpace
, Mellintrf
, Mfr
, MinusPlus
, Mopf
, Mscr
,
Mu
, NJcy
, Nacute
, Ncaron
, Ncedil
, Ncy
, NegativeMediumSpace
,
NegativeThickSpace
, NegativeThinSpace
, NegativeVeryThinSpace
,
NestedGreaterGreater
, NestedLessLess
, NewLine
, Nfr
, NoBreak
,
NonBreakingSpace
, Nopf
, Not
, NotCongruent
, NotCupCap
,
NotDoubleVerticalBar
, NotElement
, NotEqual
, NotEqualTilde
, NotExists
,
NotGreater
, NotGreaterEqual
, NotGreaterFullEqual
, NotGreaterGreater
,
NotGreaterLess
, NotGreaterSlantEqual
, NotGreaterTilde
, NotHumpDownHump
,
NotHumpEqual
, NotLeftTriangle
, NotLeftTriangleBar
, NotLeftTriangleEqual
,
NotLess
, NotLessEqual
, NotLessGreater
, NotLessLess
, NotLessSlantEqual
,
NotLessTilde
, NotNestedGreaterGreater
, NotNestedLessLess
, NotPrecedes
,
NotPrecedesEqual
, NotPrecedesSlantEqual
, NotReverseElement
,
NotRightTriangle
, NotRightTriangleBar
, NotRightTriangleEqual
,
NotSquareSubset
, NotSquareSubsetEqual
, NotSquareSuperset
,
NotSquareSupersetEqual
, NotSubset
, NotSubsetEqual
, NotSucceeds
,
NotSucceedsEqual
, NotSucceedsSlantEqual
, NotSucceedsTilde
, NotSuperset
,
NotSupersetEqual
, NotTilde
, NotTildeEqual
, NotTildeFullEqual
,
NotTildeTilde
, NotVerticalBar
, Nscr
, Ntild
, Ntilde
, Nu
, OElig
,
Oacut
, Oacute
, Ocir
, Ocirc
, Ocy
, Odblac
, Ofr
, Ograv
, Ograve
,
Omacr
, Omega
, Omicron
, Oopf
, OpenCurlyDoubleQuote
, OpenCurlyQuote
,
Or
, Oscr
, Oslas
, Oslash
, Otild
, Otilde
, Otimes
, Oum
, Ouml
,
OverBar
, OverBrace
, OverBracket
, OverParenthesis
, PartialD
, Pcy
,
Pfr
, Phi
, Pi
, PlusMinus
, Poincareplane
, Popf
, Pr
, Precedes
,
PrecedesEqual
, PrecedesSlantEqual
, PrecedesTilde
, Prime
, Product
,
Proportion
, Proportional
, Pscr
, Psi
, QUO
, QUOT
, Qfr
, Qopf
,
Qscr
, RBarr
, RE
, REG
, Racute
, Rang
, Rarr
, Rarrtl
, Rcaron
,
Rcedil
, Rcy
, Re
, ReverseElement
, ReverseEquilibrium
,
ReverseUpEquilibrium
, Rfr
, Rho
, RightAngleBracket
, RightArrow
,
RightArrowBar
, RightArrowLeftArrow
, RightCeiling
, RightDoubleBracket
,
RightDownTeeVector
, RightDownVector
, RightDownVectorBar
, RightFloor
,
RightTee
, RightTeeArrow
, RightTeeVector
, RightTriangle
,
RightTriangleBar
, RightTriangleEqual
, RightUpDownVector
,
RightUpTeeVector
, RightUpVector
, RightUpVectorBar
, RightVector
,
RightVectorBar
, Rightarrow
, Ropf
, RoundImplies
, Rrightarrow
, Rscr
,
Rsh
, RuleDelayed
, SHCHcy
, SHcy
, SOFTcy
, Sacute
, Sc
, Scaron
,
Scedil
, Scirc
, Scy
, Sfr
, ShortDownArrow
, ShortLeftArrow
,
ShortRightArrow
, ShortUpArrow
, Sigma
, SmallCircle
, Sopf
, Sqrt
,
Square
, SquareIntersection
, SquareSubset
, SquareSubsetEqual
,
SquareSuperset
, SquareSupersetEqual
, SquareUnion
, Sscr
, Star
, Sub
,
Subset
, SubsetEqual
, Succeeds
, SucceedsEqual
, SucceedsSlantEqual
,
SucceedsTilde
, SuchThat
, Sum
, Sup
, Superset
, SupersetEqual
,
Supset
, THOR
, THORN
, TRADE
, TSHcy
, TScy
, Tab
, Tau
, Tcaron
,
Tcedil
, Tcy
, Tfr
, Therefore
, Theta
, ThickSpace
, ThinSpace
,
Tilde
, TildeEqual
, TildeFullEqual
, TildeTilde
, Topf
, TripleDot
,
Tscr
, Tstrok
, Uacut
, Uacute
, Uarr
, Uarrocir
, Ubrcy
, Ubreve
,
Ucir
, Ucirc
, Ucy
, Udblac
, Ufr
, Ugrav
, Ugrave
, Umacr
, UnderBar
,
UnderBrace
, UnderBracket
, UnderParenthesis
, Union
, UnionPlus
, Uogon
,
Uopf
, UpArrow
, UpArrowBar
, UpArrowDownArrow
, UpDownArrow
,
UpEquilibrium
, UpTee
, UpTeeArrow
, Uparrow
, Updownarrow
,
UpperLeftArrow
, UpperRightArrow
, Upsi
, Upsilon
, Uring
, Uscr
,
Utilde
, Uum
, Uuml
, VDash
, Vbar
, Vcy
, Vdash
, Vdashl
, Vee
,
Verbar
, Vert
, VerticalBar
, VerticalLine
, VerticalSeparator
,
VerticalTilde
, VeryThinSpace
, Vfr
, Vopf
, Vscr
, Vvdash
, Wcirc
,
Wedge
, Wfr
, Wopf
, Wscr
, Xfr
, Xi
, Xopf
, Xscr
, YAcy
, YIcy
,
YUcy
, Yacut
, Yacute
, Ycirc
, Ycy
, Yfr
, Yopf
, Yscr
, Yuml
,
ZHcy
, Zacute
, Zcaron
, Zcy
, Zdot
, ZeroWidthSpace
, Zeta
, Zfr
,
Zopf
, Zscr
, aacut
, aacute
, abreve
, ac
, acE
, acd
, acir
,
acirc
, acut
, acute
, acy
, aeli
, aelig
, af
, afr
, agrav
,
agrave
, alefsym
, aleph
, alpha
, am
, amacr
, amalg
, amp
, and
,
andand
, andd
, andslope
, andv
, ang
, ange
, angle
, angmsd
,
angmsdaa
, angmsdab
, angmsdac
, angmsdad
, angmsdae
, angmsdaf
,
angmsdag
, angmsdah
, angrt
, angrtvb
, angrtvbd
, angsph
, angst
,
angzarr
, aogon
, aopf
, ap
, apE
, apacir
, ape
, apid
, apos
,
approx
, approxeq
, arin
, aring
, ascr
, ast
, asymp
, asympeq
,
atild
, atilde
, aum
, auml
, awconint
, awint
, bNot
, backcong
,
backepsilon
, backprime
, backsim
, backsimeq
, barvee
, barwed
,
barwedge
, bbrk
, bbrktbrk
, bcong
, bcy
, bdquo
, becaus
, because
,
bemptyv
, bepsi
, bernou
, beta
, beth
, between
, bfr
, bigcap
,
bigcirc
, bigcup
, bigodot
, bigoplus
, bigotimes
, bigsqcup
, bigstar
,
bigtriangledown
, bigtriangleup
, biguplus
, bigvee
, bigwedge
, bkarow
,
blacklozenge
, blacksquare
, blacktriangle
, blacktriangledown
,
blacktriangleleft
, blacktriangleright
, blank
, blk12
, blk14
, blk34
,
block
, bne
, bnequiv
, bnot
, bopf
, bot
, bottom
, bowtie
, boxDL
,
boxDR
, boxDl
, boxDr
, boxH
, boxHD
, boxHU
, boxHd
, boxHu
, boxUL
,
boxUR
, boxUl
, boxUr
, boxV
, boxVH
, boxVL
, boxVR
, boxVh
, boxVl
,
boxVr
, boxbox
, boxdL
, boxdR
, boxdl
, boxdr
, boxh
, boxhD
, boxhU
,
boxhd
, boxhu
, boxminus
, boxplus
, boxtimes
, boxuL
, boxuR
, boxul
,
boxur
, boxv
, boxvH
, boxvL
, boxvR
, boxvh
, boxvl
, boxvr
, bprime
,
breve
, brvba
, brvbar
, bscr
, bsemi
, bsim
, bsime
, bsol
, bsolb
,
bsolhsub
, bull
, bullet
, bump
, bumpE
, bumpe
, bumpeq
, cacute
,
cap
, capand
, capbrcup
, capcap
, capcup
, capdot
, caps
, caret
,
caron
, ccaps
, ccaron
, ccedi
, ccedil
, ccirc
, ccups
, ccupssm
,
cdot
, cedi
, cedil
, cemptyv
, cen
, cent
, centerdot
, cfr
, chcy
,
check
, checkmark
, chi
, cir
, cirE
, circ
, circeq
, circlearrowleft
,
circlearrowright
, circledR
, circledS
, circledast
, circledcirc
,
circleddash
, cire
, cirfnint
, cirmid
, cirscir
, clubs
, clubsuit
,
colon
, colone
, coloneq
, comma
, commat
, comp
, compfn
, complement
,
complexes
, cong
, congdot
, conint
, cop
, copf
, coprod
, copy
,
copysr
, crarr
, cross
, cscr
, csub
, csube
, csup
, csupe
, ctdot
,
cudarrl
, cudarrr
, cuepr
, cuesc
, cularr
, cularrp
, cup
, cupbrcap
,
cupcap
, cupcup
, cupdot
, cupor
, cups
, curarr
, curarrm
,
curlyeqprec
, curlyeqsucc
, curlyvee
, curlywedge
, curre
, curren
,
curvearrowleft
, curvearrowright
, cuvee
, cuwed
, cwconint
, cwint
,
cylcty
, dArr
, dHar
, dagger
, daleth
, darr
, dash
, dashv
,
dbkarow
, dblac
, dcaron
, dcy
, dd
, ddagger
, ddarr
, ddotseq
, de
,
deg
, delta
, demptyv
, dfisht
, dfr
, dharl
, dharr
, diam
, diamond
,
diamondsuit
, diams
, die
, digamma
, disin
, div
, divid
, divide
,
divideontimes
, divonx
, djcy
, dlcorn
, dlcrop
, dollar
, dopf
, dot
,
doteq
, doteqdot
, dotminus
, dotplus
, dotsquare
, doublebarwedge
,
downarrow
, downdownarrows
, downharpoonleft
, downharpoonright
, drbkarow
,
drcorn
, drcrop
, dscr
, dscy
, dsol
, dstrok
, dtdot
, dtri
, dtrif
,
duarr
, duhar
, dwangle
, dzcy
, dzigrarr
, eDDot
, eDot
, eacut
,
eacute
, easter
, ecaron
, ecir
, ecir
, ecirc
, ecolon
, ecy
, edot
,
ee
, efDot
, efr
, eg
, egrav
, egrave
, egs
, egsdot
, el
,
elinters
, ell
, els
, elsdot
, emacr
, empty
, emptyset
, emptyv
,
emsp
, emsp13
, emsp14
, eng
, ensp
, eogon
, eopf
, epar
, eparsl
,
eplus
, epsi
, epsilon
, epsiv
, eqcirc
, eqcolon
, eqsim
, eqslantgtr
,
eqslantless
, equals
, equest
, equiv
, equivDD
, eqvparsl
, erDot
,
erarr
, escr
, esdot
, esim
, et
, eta
, eth
, eum
, euml
, euro
,
excl
, exist
, expectation
, exponentiale
, fallingdotseq
, fcy
,
female
, ffilig
, fflig
, ffllig
, ffr
, filig
, fjlig
, flat
, fllig
,
fltns
, fnof
, fopf
, forall
, fork
, forkv
, fpartint
, frac1
,
frac1
, frac12
, frac13
, frac14
, frac15
, frac16
, frac18
, frac23
,
frac25
, frac3
, frac34
, frac35
, frac38
, frac45
, frac56
, frac58
,
frac78
, frasl
, frown
, fscr
, g
, gE
, gEl
, gacute
, gamma
,
gammad
, gap
, gbreve
, gcirc
, gcy
, gdot
, ge
, gel
, geq
, geqq
,
geqslant
, ges
, gescc
, gesdot
, gesdoto
, gesdotol
, gesl
, gesles
,
gfr
, gg
, ggg
, gimel
, gjcy
, gl
, glE
, gla
, glj
, gnE
, gnap
,
gnapprox
, gne
, gneq
, gneqq
, gnsim
, gopf
, grave
, gscr
, gsim
,
gsime
, gsiml
, gt
, gtcc
, gtcir
, gtdot
, gtlPar
, gtquest
,
gtrapprox
, gtrarr
, gtrdot
, gtreqless
, gtreqqless
, gtrless
, gtrsim
,
gvertneqq
, gvnE
, hArr
, hairsp
, half
, hamilt
, hardcy
, harr
,
harrcir
, harrw
, hbar
, hcirc
, hearts
, heartsuit
, hellip
, hercon
,
hfr
, hksearow
, hkswarow
, hoarr
, homtht
, hookleftarrow
,
hookrightarrow
, hopf
, horbar
, hscr
, hslash
, hstrok
, hybull
,
hyphen
, iacut
, iacute
, ic
, icir
, icirc
, icy
, iecy
, iexc
,
iexcl
, iff
, ifr
, igrav
, igrave
, ii
, iiiint
, iiint
, iinfin
,
iiota
, ijlig
, imacr
, image
, imagline
, imagpart
, imath
, imof
,
imped
, in
, incare
, infin
, infintie
, inodot
, int
, intcal
,
integers
, intercal
, intlarhk
, intprod
, iocy
, iogon
, iopf
, iota
,
iprod
, iques
, iquest
, iscr
, isin
, isinE
, isindot
, isins
,
isinsv
, isinv
, it
, itilde
, iukcy
, ium
, iuml
, jcirc
, jcy
,
jfr
, jmath
, jopf
, jscr
, jsercy
, jukcy
, kappa
, kappav
, kcedil
,
kcy
, kfr
, kgreen
, khcy
, kjcy
, kopf
, kscr
, l
, lAarr
, lArr
,
lAtail
, lBarr
, lE
, lEg
, lHar
, lacute
, laemptyv
, lagran
,
lambda
, lang
, langd
, langle
, lap
, laqu
, laquo
, larr
, larrb
,
larrbfs
, larrfs
, larrhk
, larrlp
, larrpl
, larrsim
, larrtl
, lat
,
latail
, late
, lates
, lbarr
, lbbrk
, lbrace
, lbrack
, lbrke
,
lbrksld
, lbrkslu
, lcaron
, lcedil
, lceil
, lcub
, lcy
, ldca
,
ldquo
, ldquor
, ldrdhar
, ldrushar
, ldsh
, le
, leftarrow
,
leftarrowtail
, leftharpoondown
, leftharpoonup
, leftleftarrows
,
leftrightarrow
, leftrightarrows
, leftrightharpoons
, leftrightsquigarrow
,
leftthreetimes
, leg
, leq
, leqq
, leqslant
, les
, lescc
, lesdot
,
lesdoto
, lesdotor
, lesg
, lesges
, lessapprox
, lessdot
, lesseqgtr
,
lesseqqgtr
, lessgtr
, lesssim
, lfisht
, lfloor
, lfr
, lg
, lgE
,
lhard
, lharu
, lharul
, lhblk
, ljcy
, ll
, llarr
, llcorner
,
llhard
, lltri
, lmidot
, lmoust
, lmoustache
, lnE
, lnap
, lnapprox
,
lne
, lneq
, lneqq
, lnsim
, loang
, loarr
, lobrk
, longleftarrow
,
longleftrightarrow
, longmapsto
, longrightarrow
, looparrowleft
,
looparrowright
, lopar
, lopf
, loplus
, lotimes
, lowast
, lowbar
,
loz
, lozenge
, lozf
, lpar
, lparlt
, lrarr
, lrcorner
, lrhar
,
lrhard
, lrm
, lrtri
, lsaquo
, lscr
, lsh
, lsim
, lsime
, lsimg
,
lsqb
, lsquo
, lsquor
, lstrok
, lt
, ltcc
, ltcir
, ltdot
, lthree
,
ltimes
, ltlarr
, ltquest
, ltrPar
, ltri
, ltrie
, ltrif
, lurdshar
,
luruhar
, lvertneqq
, lvnE
, mDDot
, mac
, macr
, male
, malt
,
maltese
, map
, mapsto
, mapstodown
, mapstoleft
, mapstoup
, marker
,
mcomma
, mcy
, mdash
, measuredangle
, mfr
, mho
, micr
, micro
,
mid
, midast
, midcir
, middo
, middot
, minus
, minusb
, minusd
,
minusdu
, mlcp
, mldr
, mnplus
, models
, mopf
, mp
, mscr
, mstpos
,
mu
, multimap
, mumap
, nGg
, nGt
, nGtv
, nLeftarrow
,
nLeftrightarrow
, nLl
, nLt
, nLtv
, nRightarrow
, nVDash
, nVdash
,
nabla
, nacute
, nang
, nap
, napE
, napid
, napos
, napprox
, natur
,
natural
, naturals
, nbs
, nbsp
, nbump
, nbumpe
, ncap
, ncaron
,
ncedil
, ncong
, ncongdot
, ncup
, ncy
, ndash
, ne
, neArr
, nearhk
,
nearr
, nearrow
, nedot
, nequiv
, nesear
, nesim
, nexist
, nexists
,
nfr
, ngE
, nge
, ngeq
, ngeqq
, ngeqslant
, nges
, ngsim
, ngt
,
ngtr
, nhArr
, nharr
, nhpar
, ni
, nis
, nisd
, niv
, njcy
, nlArr
,
nlE
, nlarr
, nldr
, nle
, nleftarrow
, nleftrightarrow
, nleq
,
nleqq
, nleqslant
, nles
, nless
, nlsim
, nlt
, nltri
, nltrie
,
nmid
, no
, nopf
, not
, notin
, notinE
, notindot
, notinva
,
notinvb
, notinvc
, notni
, notniva
, notnivb
, notnivc
, npar
,
nparallel
, nparsl
, npart
, npolint
, npr
, nprcue
, npre
, nprec
,
npreceq
, nrArr
, nrarr
, nrarrc
, nrarrw
, nrightarrow
, nrtri
,
nrtrie
, nsc
, nsccue
, nsce
, nscr
, nshortmid
, nshortparallel
,
nsim
, nsime
, nsimeq
, nsmid
, nspar
, nsqsube
, nsqsupe
, nsub
,
nsubE
, nsube
, nsubset
, nsubseteq
, nsubseteqq
, nsucc
, nsucceq
,
nsup
, nsupE
, nsupe
, nsupset
, nsupseteq
, nsupseteqq
, ntgl
, ntild
,
ntilde
, ntlg
, ntriangleleft
, ntrianglelefteq
, ntriangleright
,
ntrianglerighteq
, nu
, num
, numero
, numsp
, nvDash
, nvHarr
, nvap
,
nvdash
, nvge
, nvgt
, nvinfin
, nvlArr
, nvle
, nvlt
, nvltrie
,
nvrArr
, nvrtrie
, nvsim
, nwArr
, nwarhk
, nwarr
, nwarrow
, nwnear
,
oS
, oacut
, oacute
, oast
, ocir
, ocir
, ocirc
, ocy
, odash
,
odblac
, odiv
, odot
, odsold
, oelig
, ofcir
, ofr
, ogon
, ograv
,
ograve
, ogt
, ohbar
, ohm
, oint
, olarr
, olcir
, olcross
, oline
,
olt
, omacr
, omega
, omicron
, omid
, ominus
, oopf
, opar
, operp
,
oplus
, or
, orarr
, ord
, ord
, ord
, order
, orderof
, ordf
, ordm
,
origof
, oror
, orslope
, orv
, oscr
, oslas
, oslash
, osol
, otild
,
otilde
, otimes
, otimesas
, oum
, ouml
, ovbar
, par
, par
, para
,
parallel
, parsim
, parsl
, part
, pcy
, percnt
, period
, permil
,
perp
, pertenk
, pfr
, phi
, phiv
, phmmat
, phone
, pi
, pitchfork
,
piv
, planck
, planckh
, plankv
, plus
, plusacir
, plusb
, pluscir
,
plusdo
, plusdu
, pluse
, plusm
, plusmn
, plussim
, plustwo
, pm
,
pointint
, popf
, poun
, pound
, pr
, prE
, prap
, prcue
, pre
,
prec
, precapprox
, preccurlyeq
, preceq
, precnapprox
, precneqq
,
precnsim
, precsim
, prime
, primes
, prnE
, prnap
, prnsim
, prod
,
profalar
, profline
, profsurf
, prop
, propto
, prsim
, prurel
, pscr
,
psi
, puncsp
, qfr
, qint
, qopf
, qprime
, qscr
, quaternions
,
quatint
, quest
, questeq
, quo
, quot
, rAarr
, rArr
, rAtail
,
rBarr
, rHar
, race
, racute
, radic
, raemptyv
, rang
, rangd
,
range
, rangle
, raqu
, raquo
, rarr
, rarrap
, rarrb
, rarrbfs
,
rarrc
, rarrfs
, rarrhk
, rarrlp
, rarrpl
, rarrsim
, rarrtl
, rarrw
,
ratail
, ratio
, rationals
, rbarr
, rbbrk
, rbrace
, rbrack
, rbrke
,
rbrksld
, rbrkslu
, rcaron
, rcedil
, rceil
, rcub
, rcy
, rdca
,
rdldhar
, rdquo
, rdquor
, rdsh
, re
, real
, realine
, realpart
,
reals
, rect
, reg
, rfisht
, rfloor
, rfr
, rhard
, rharu
, rharul
,
rho
, rhov
, rightarrow
, rightarrowtail
, rightharpoondown
,
rightharpoonup
, rightleftarrows
, rightleftharpoons
, rightrightarrows
,
rightsquigarrow
, rightthreetimes
, ring
, risingdotseq
, rlarr
, rlhar
,
rlm
, rmoust
, rmoustache
, rnmid
, roang
, roarr
, robrk
, ropar
,
ropf
, roplus
, rotimes
, rpar
, rpargt
, rppolint
, rrarr
, rsaquo
,
rscr
, rsh
, rsqb
, rsquo
, rsquor
, rthree
, rtimes
, rtri
, rtrie
,
rtrif
, rtriltri
, ruluhar
, rx
, sacute
, sbquo
, sc
, scE
, scap
,
scaron
, sccue
, sce
, scedil
, scirc
, scnE
, scnap
, scnsim
,
scpolint
, scsim
, scy
, sdot
, sdotb
, sdote
, seArr
, searhk
,
searr
, searrow
, sec
, sect
, semi
, seswar
, setminus
, setmn
,
sext
, sfr
, sfrown
, sh
, sharp
, shchcy
, shcy
, shortmid
,
shortparallel
, shy
, sigma
, sigmaf
, sigmav
, sim
, simdot
, sime
,
simeq
, simg
, simgE
, siml
, simlE
, simne
, simplus
, simrarr
,
slarr
, smallsetminus
, smashp
, smeparsl
, smid
, smile
, smt
, smte
,
smtes
, softcy
, sol
, solb
, solbar
, sopf
, spades
, spadesuit
,
spar
, sqcap
, sqcaps
, sqcup
, sqcups
, sqsub
, sqsube
, sqsubset
,
sqsubseteq
, sqsup
, sqsupe
, sqsupset
, sqsupseteq
, squ
, square
,
squarf
, squf
, srarr
, sscr
, ssetmn
, ssmile
, sstarf
, star
,
starf
, straightepsilon
, straightphi
, strns
, sub
, subE
, subdot
,
sube
, subedot
, submult
, subnE
, subne
, subplus
, subrarr
, subset
,
subseteq
, subseteqq
, subsetneq
, subsetneqq
, subsim
, subsub
,
subsup
, succ
, succapprox
, succcurlyeq
, succeq
, succnapprox
,
succneqq
, succnsim
, succsim
, sum
, sung
, sup
, sup
, sup
, sup
,
sup1
, sup2
, sup3
, supE
, supdot
, supdsub
, supe
, supedot
,
suphsol
, suphsub
, suplarr
, supmult
, supnE
, supne
, supplus
,
supset
, supseteq
, supseteqq
, supsetneq
, supsetneqq
, supsim
,
supsub
, supsup
, swArr
, swarhk
, swarr
, swarrow
, swnwar
, szli
,
szlig
, target
, tau
, tbrk
, tcaron
, tcedil
, tcy
, tdot
, telrec
,
tfr
, there4
, therefore
, theta
, thetasym
, thetav
, thickapprox
,
thicksim
, thinsp
, thkap
, thksim
, thor
, thorn
, tilde
, time
,
times
, timesb
, timesbar
, timesd
, tint
, toea
, top
, topbot
,
topcir
, topf
, topfork
, tosa
, tprime
, trade
, triangle
,
triangledown
, triangleleft
, trianglelefteq
, triangleq
, triangleright
,
trianglerighteq
, tridot
, trie
, triminus
, triplus
, trisb
, tritime
,
trpezium
, tscr
, tscy
, tshcy
, tstrok
, twixt
, twoheadleftarrow
,
twoheadrightarrow
, uArr
, uHar
, uacut
, uacute
, uarr
, ubrcy
,
ubreve
, ucir
, ucirc
, ucy
, udarr
, udblac
, udhar
, ufisht
, ufr
,
ugrav
, ugrave
, uharl
, uharr
, uhblk
, ulcorn
, ulcorner
, ulcrop
,
ultri
, um
, umacr
, uml
, uogon
, uopf
, uparrow
, updownarrow
,
upharpoonleft
, upharpoonright
, uplus
, upsi
, upsih
, upsilon
,
upuparrows
, urcorn
, urcorner
, urcrop
, uring
, urtri
, uscr
, utdot
,
utilde
, utri
, utrif
, uuarr
, uum
, uuml
, uwangle
, vArr
, vBar
,
vBarv
, vDash
, vangrt
, varepsilon
, varkappa
, varnothing
, varphi
,
varpi
, varpropto
, varr
, varrho
, varsigma
, varsubsetneq
,
varsubsetneqq
, varsupsetneq
, varsupsetneqq
, vartheta
, vartriangleleft
,
vartriangleright
, vcy
, vdash
, vee
, veebar
, veeeq
, vellip
,
verbar
, vert
, vfr
, vltri
, vnsub
, vnsup
, vopf
, vprop
, vrtri
,
vscr
, vsubnE
, vsubne
, vsupnE
, vsupne
, vzigzag
, wcirc
, wedbar
,
wedge
, wedgeq
, weierp
, wfr
, wopf
, wp
, wr
, wreath
, wscr
,
xcap
, xcirc
, xcup
, xdtri
, xfr
, xhArr
, xharr
, xi
, xlArr
,
xlarr
, xmap
, xnis
, xodot
, xopf
, xoplus
, xotime
, xrArr
, xrarr
,
xscr
, xsqcup
, xuplus
, xutri
, xvee
, xwedge
, yacut
, yacute
,
yacy
, ycirc
, ycy
, ye
, yen
, yfr
, yicy
, yopf
, yscr
, yucy
,
yum
, yuml
, zacute
, zcaron
, zcy
, zdot
, zeetrf
, zeta
, zfr
,
zhcy
, zigrarr
, zopf
, zscr
, zwj
, or zwnj
.
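A parser typically checks a candidate name against such a list with a table lookup. The sketch below uses the `html5` table from Python's standard `html.entities` module as a stand-in for the list above; that table is keyed by name with a trailing semicolon and may not match this document's list entry for entry. The function name `decode_named_reference` is hypothetical.

```python
from html.entities import html5


def decode_named_reference(name):
    """Return the replacement text for a name given without `&` and `;`.

    Returns None when the name is not in Python's HTML5 table, which is
    used here only as a stand-in for the list above.
    """
    return html5.get(name + ";")
```

For instance, `decode_named_reference("amp")` yields `&`, while an unknown name yields `None`.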
Thanks to John Gruber for inventing Markdown.
Thanks to John MacFarlane for defining CommonMark.
Thanks to ZEIT, Inc., Gatsby, Inc., Netlify, Inc., Holloway, Inc., and the many organizations and individuals for financial support through OpenCollective.
Copyright © 2019 Titus Wormer. This work is licensed under a Creative Commons Attribution 4.0 International License.