Closed petdance closed 5 years ago
The HTML5 spec has a section on ambiguous ampersands:
An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is followed by one or more alphanumeric ASCII characters, followed by a ";" (U+003B) character, where these characters do not match any of the names given in the named character references section.
But no named character reference matches &d;, &di;, ..., &display, etc. HTML 4.01 is similar to HTML5 in this respect: https://mathiasbynens.be/notes/ambiguous-ampersands.
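To make the definition quoted above concrete, here is a minimal sketch in Python. It assumes the html5 table in the standard html.entities module is an acceptable stand-in for the spec's named character references section; the function name ambiguous_ampersands is made up for illustration.

```python
import re
from html.entities import html5  # maps names such as "copy;" to their characters

# "&", then one or more ASCII alphanumerics, then ";"
AMP_NAME = re.compile(r"&([A-Za-z0-9]+;)")


def ambiguous_ampersands(text):
    """Return every '&name;' in text whose name is not a known named reference."""
    return [m.group(0) for m in AMP_NAME.finditer(text) if m.group(1) not in html5]


print(ambiguous_ampersands("css?family=Bitter&display=swap"))
print(ambiguous_ampersands("a &display; b &copy; c"))
```

The first call finds nothing because &display is followed by "=" rather than ";", so it never fits the definition. The second call reports only &display;, since &copy; is a real named reference.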
I'll try tidy-html5 against this repository. As it generates some warnings, I'll consider replacing each bare & with &amp; to be explicit.
This repository uses html5validator, which is based on The Nu Html Checker (v.Nu). It generates an error for the following HTML:
<link href="https://fonts.googleapis.com/css?family=Bitter:400,400i,700&copy" rel="stylesheet">
$ html5validator _site/2017/01/02/my-example-post/index.html
ERROR:html5validator.validator:"file:/Users/yous/src/whiteglass/_site/2017/01/02/my-example-post/index.html":40.1-40.72: error: The string following "&" was interpreted as a character reference. ("&" probably should have been escaped as "&amp;".)
That is because a named character reference &copy exists. With the original URL, however, html5validator doesn't generate errors.
Character entities have to have a semicolon at the end.
If the &copy in the URL is supposed to be the copyright symbol entity, then it should have the semicolon at the end, as &copy;. And if it's not supposed to be the copyright symbol entity, then it should be &amp;copy.
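For comparison, the three spellings in question (&copy, &copy; and &amp;copy) can be run through html.unescape from Python's standard library. This is only a rough illustration: html.unescape decodes the way text content is decoded, and attribute values follow a slightly different rule when the reference has no semicolon.

```python
import html

# Decode each spelling as if it appeared in ordinary text content.
for source in ("&copy", "&copy;", "&amp;copy"):
    print(repr(source), "->", repr(html.unescape(source)))
```

The first two both come out as the copyright sign, while &amp;copy comes out as the literal text &copy, which is what you want when the ampersand is just part of a URL.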
Yes, right. I meant that if there were an ambiguous ampersand in the URL, html5validator would give some errors. But the actual URL doesn't contain an ambiguous ampersand, as there is no &dis;, &disp;, etc.
I tried some snippets:
<a title="&copy=foo">link</a>: html5validator doesn't give errors, and the hover text is &copy=foo.
<a title="&copy;=foo">link</a>: html5validator doesn't give errors, and the hover text is ©=foo.
<a title="&copy">link</a>: html5validator gives an error, and the hover text is ©.
<a title="&copyfoo">link</a>: html5validator doesn't give errors, and the hover text is &copyfoo.
So when the text is not explicitly written as a character entity (there is no semicolon) but will still be parsed as one, html5validator gives an error.
I think using html5validator is enough. Is there something more to consider, or am I missing something?
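The behaviour in snippets like these matches, as far as I can tell, the attribute-value rule in the HTML parsing spec: a named reference without a semicolon is left as literal text when the next character is "=" or an ASCII alphanumeric, and is otherwise decoded with a parse error. Below is a rough Python sketch of that one rule. The function decode_attribute is hypothetical, it ignores numeric references and the separate ambiguous-ampersand error, and it is not how v.Nu itself is implemented.

```python
from html.entities import html5  # has both "copy;" and the legacy "copy"

MAX_LEN = max(len(name) for name in html5)


def decode_attribute(value):
    """Return (decoded_text, has_error) for an attribute value."""
    out = []
    has_error = False
    i = 0
    while i < len(value):
        if value[i] != "&":
            out.append(value[i])
            i += 1
            continue
        # Find the longest named reference starting right after the "&".
        match = None
        for end in range(min(len(value), i + 1 + MAX_LEN), i + 1, -1):
            if value[i + 1:end] in html5:
                match = value[i + 1:end]
                break
        if match is None:
            out.append("&")  # not a reference at all, keep the ampersand
            i += 1
            continue
        after = i + 1 + len(match)
        nxt = value[after:after + 1]
        if not match.endswith(";") and (nxt == "=" or (nxt.isascii() and nxt.isalnum())):
            # Historical rule for attributes: keep the text literally, no error.
            out.append("&" + match)
        else:
            out.append(html5[match])
            # A decoded reference without a semicolon is a parse error.
            has_error = has_error or not match.endswith(";")
        i = after
    return "".join(out), has_error


for sample in ("&copy=foo", "&copy;=foo", "&copy", "&copyfoo", "&display=swap"):
    print(repr(sample), "->", decode_attribute(sample))
```

Under this sketch only the bare &copy is both decoded and flagged, and &display=swap passes through untouched because no named reference starts with those letters.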
I understand that &display=swap is not ambiguous, and that browsers may handle it OK. Still, the correct way to write it is &amp;display=swap. I'm not seeing a reason not to.
This instance in fonts.html is the only instance of this problem that I've seen.
Okay. There aren't many ampersands, so I'm merging now.
My mistake, it needs to be HTML-encoded, not URL-encoded.
Yes, there is a problem with the unencoded ampersand. Consider:
The & in the URL has to be encoded as &amp; just like any other ampersand in the HTML document. Note that this change does not change the URL. It's simply encoding it correctly.
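A small Python sketch of the difference between HTML-encoding and URL-encoding, using only the standard library. The full URL below is an assumption, pieced together from the fonts link and the &display=swap parameter mentioned in this thread.

```python
import html
from urllib.parse import quote

# Assumed URL, pieced together from the fragments quoted in this thread.
url = "https://fonts.googleapis.com/css?family=Bitter:400,400i,700&display=swap"

# HTML-encoding: what belongs inside the href attribute.
html_encoded = html.escape(url)

# URL-encoding (percent-encoding) of the same fragment, shown only for contrast.
url_encoded = quote("&display=swap", safe="")

print(html_encoded)
print(url_encoded)

# Decoding the attribute value gives back exactly the original URL.
print(html.unescape(html_encoded) == url)
```

The HTML-encoded form ends in 700&amp;display=swap and decodes back to the same URL, so the browser requests the same resource. The percent-encoded form turns the separator into %26, which would change what the query string means.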