Open autowp opened 7 years ago
"Ugly" is not the problem in security-sensitive contexts. Also, most source viewers already make these attributes simple to read (Firefox does, for example).
As for the size, gzip compression generally deals with it.
It is not easy to understand where the security improvement is here.
For example, why is "dot" a safe character but "semicolon" is not?
As for the size: on my example Cyrillic page, where escapeHtmlAttr is partially used:
68988 bytes - only quotes and angle brackets escaped
83611 bytes - escaped by escapeHtmlAttr (+21%)
Same with gzip:
11116 bytes
11790 bytes (+6%)
Indeed, the size is not crucial.
Are you asking to add more characters to the whitelist, so they don't get encoded?
Maybe you could argue that certain characters like ":" don't need to be escaped, but it's easier to have a very small white-list of "known good" characters (everything matching [^a-z0-9,\.\-_] gets encoded) than to try to work out which characters are allowed in each context.
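As a rough illustration of the white-list approach, here is a minimal sketch. It is not the library's actual implementation: sketchEscapeAttr() is a hypothetical name, the mbstring extension is assumed for mb_ord(), and special cases such as named entities and control characters are left out.

```php
<?php
// Hypothetical helper for illustration only, NOT the library's real code.
function sketchEscapeAttr(string $value): string
{
    // Every character outside the small "known good" set is replaced.
    // preg_replace_callback() returns null on invalid UTF-8, hence "?? ''".
    return preg_replace_callback(
        '/[^a-z0-9,\.\-_]/iu',
        static function (array $m): string {
            // Encode the character as a hexadecimal numeric entity.
            return sprintf('&#x%02X;', mb_ord($m[0], 'UTF-8'));
        },
        $value
    ) ?? '';
}

echo sketchEscapeAttr('a.b;c'), "\n"; // a.b&#x3B;c
```

The point of the design is that nobody has to decide, character by character, whether ";" or ":" is dangerous in a given context; anything not explicitly known to be harmless is encoded.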
For anyone not familiar with the background... the reason escapeHtmlAttr() encodes more aggressively than escapeHtml() is for non-quoted attributes.
Let's say someone did:
$url = 'https://www.example.com/';
<a href=<?= $escaper->escapeHtmlAttr($url) ?>>
Notice that it does not include quote marks.
This creates the fairly "ugly" output:
<a href=https&#x3A;&#x2F;&#x2F;www.example.com&#x2F;>
What happens if $url was provided by the user (maybe a link to their website), and they set it to:
$url = 'https://www.example.com/ onclick=do_evil_thing';
Without using escapeHtmlAttr(), it would create the perfectly valid:
<a href=https://www.example.com/ onclick=do_evil_thing>
This means they can create an onclick event handler on your website :-)
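To make that concrete, here is a small sketch showing that htmlspecialchars() on its own does not help in an unquoted attribute. The payload is the one from the example above; the variable names are just for illustration.

```php
<?php
$url = 'https://www.example.com/ onclick=do_evil_thing';

// htmlspecialchars() only rewrites & < > " and (with ENT_QUOTES) ',
// so the space and "=" in the payload pass through unchanged.
$escaped = htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Unquoted attribute: the browser treats the space as the end of href.
echo '<a href=' . $escaped . '>', "\n";
// <a href=https://www.example.com/ onclick=do_evil_thing>
```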
You could still use escapeHtml() or htmlspecialchars(), but you must make sure your attributes are quoted.
<a href="<?= $escaper->escapeHtml($url) ?>">
So that it creates:
<a href="https://www.example.com/">
Or, if you want to use htmlspecialchars(), don't forget to use it in full:
htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'utf-8')
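A quick sketch of why the quotes matter: with a quoted attribute, the only way out is a quote character, and ENT_QUOTES encodes those. The hostile value below is a made-up example.

```php
<?php
// Hypothetical hostile value that tries to close the attribute early.
$url = 'https://www.example.com/" onclick="do_evil_thing';

$escaped = htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Quoted attribute: the encoded quotes cannot terminate the value.
echo '<a href="' . $escaped . '">', "\n";
// <a href="https://www.example.com/&quot; onclick=&quot;do_evil_thing">
```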
PS: Have a look at adding a CSP (Content Security Policy), and set it so that it does not allow unsafe-inline for scripts or styles. This will probably require you to make some changes, but it adds a second line of defence against this problem, where any attributes like onclick would be blocked by the browser.
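For example, a policy along these lines could be sent from PHP. This is only a sketch: the exact sources are an assumption and need adjusting to whatever your site actually loads.

```php
<?php
// Example policy only. Leaving 'unsafe-inline' out of script-src and
// style-src is what makes the browser refuse inline handlers like onclick.
$policy = "default-src 'self'; script-src 'self'; style-src 'self'";

// Headers must be sent before any output.
if (!headers_sent()) {
    header('Content-Security-Policy: ' . $policy);
}
```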
@craigfrancis Thanks for your explanation! I think this could improve the documentation.
This repository has been closed and moved to laminas/laminas-escaper; a new issue has been opened at https://github.com/laminas/laminas-escaper/issues/3.
What requires escaping such a large number of characters in attributes?
[^a-z0-9,\.\-_]
URLs in HTML look ugly and are larger than necessary.