Closed hipstermojo closed 4 years ago
I'll just go ahead and answer my own question. The HTML5 spec has no restrictions on what to use for naming ids; the HTML4 spec, however, does. So I guess the CSS selector parser follows the older standard.
https://drafts.csswg.org/selectors-3/#id-selectors uses https://www.w3.org/TR/CSS21/syndata.html#value-def-identifier, which says "they cannot start with a digit, two hyphens, or a hyphen followed by a digit".
The syntax or restrictions on HTML id attributes are entirely separate from those of CSS ID selectors.
The former can contain HTML escape sequences, so the ID value for <p id="&quot;"> is one double-quote character.
The latter is # followed by a CSS identifier, which indeed cannot start with an ASCII digit. However, it is possible to write a CSS identifier that represents a value starting with a digit, by escaping it.
CSS escape sequences are a backslash followed by either the character to be escaped, or by a sequence of hexadecimal digits representing the Unicode code point (followed by an optional space, to separate following digits that are not meant to be part of the escape sequence). To resolve ambiguities, hex digits can only be escaped as their code point value.
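The escaping rules above can be sketched as a small helper that turns an arbitrary id value into a CSS identifier. This is only an illustration: css_escape_ident and push_hex_escape are hypothetical names, not part of Kuchiki, and the logic loosely follows the "serialize an identifier" rules from the CSSOM spec rather than any particular library's implementation.

```rust
/// Escape `id` so that it can follow `#` in a CSS selector.
/// Hypothetical helper, loosely modeled on CSSOM's identifier serialization.
fn css_escape_ident(id: &str) -> String {
    let mut out = String::with_capacity(id.len());
    let first_is_hyphen = id.starts_with('-');
    for (i, c) in id.chars().enumerate() {
        match c {
            // NULL cannot be represented; substitute the replacement character.
            '\0' => out.push('\u{FFFD}'),
            // Control characters are escaped by code point.
            '\u{1}'..='\u{1F}' | '\u{7F}' => push_hex_escape(&mut out, c),
            // A leading digit (or a digit right after a leading hyphen)
            // must be escaped by code point, since digits are hex digits.
            '0'..='9' if i == 0 || (i == 1 && first_is_hyphen) => {
                push_hex_escape(&mut out, c)
            }
            // A lone hyphen is not a valid identifier on its own.
            '-' if i == 0 && id.len() == 1 => out.push_str("\\-"),
            // Ordinary identifier characters pass through unchanged.
            'a'..='z' | 'A'..='Z' | '0'..='9' | '-' | '_' => out.push(c),
            c if c as u32 >= 0x80 => out.push(c),
            // Everything else is escaped with a backslash before the character.
            _ => {
                out.push('\\');
                out.push(c);
            }
        }
    }
    out
}

/// Backslash, hex code point, then a space that terminates the escape.
fn push_hex_escape(out: &mut String, c: char) {
    out.push_str(&format!("\\{:x} ", c as u32));
}

fn main() {
    // Build an ID selector for an element whose id starts with a digit.
    println!("p#{}", css_escape_ident("1abc"));
}
```

The trailing space after a hex escape is what keeps a following character such as the 2 in "12" from being read as part of the escape.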
TL;DR: to select <p id="1">, use select_first("p#\\31") (that is, the selector p#\31, with the backslash doubled for the Rust string literal), since U+0031 is the ASCII digit 1.
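One practical detail when writing this selector in Rust source: a backslash inside an ordinary string literal has to be doubled, or a raw string can be used instead. The sketch below uses only the standard library. decode_hex_escape is a hypothetical helper that checks which character a hex escape stands for; the resulting string is what would be handed to select_first.

```rust
/// Decode the hex digits of a CSS escape (e.g. "31") into a character.
/// Hypothetical helper for illustration only.
fn decode_hex_escape(hex: &str) -> Option<char> {
    u32::from_str_radix(hex, 16).ok().and_then(char::from_u32)
}

fn main() {
    // Two equivalent ways to spell the selector p#\31 in Rust source.
    let raw = r"p#\31";
    let escaped = "p#\\31";
    assert_eq!(raw, escaped);

    // The hex digits after the backslash name U+0031, the ASCII digit 1.
    let hex = raw.rsplit('\\').next().unwrap();
    assert_eq!(decode_hex_escape(hex), Some('1'));

    // `raw` is the string one would pass to select_first.
    println!("{}", raw);
}
```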
More details: https://mathiasbynens.be/notes/css-escapes
Oh wow! Thanks for clearing this up for me. While it's probably an edge case scenario, I'll be sure to keep this in mind.
It seems that Kuchiki will return an Err when calling select_first if the id begins with a number: selecting such an element by its id, which I would expect to work, just returns an Err. Is this a bug in the way a CSS selector is parsed, or does the CSS spec require ids to start with an alphabetic character?