w3c / i18n-checker

W3C's i18n checker
https://validator.w3.org/i18n-checker/

test for control characters #16

Closed r12a closed 8 years ago

r12a commented 8 years ago

https://html.spec.whatwg.org/multipage/dom.html#phrasing-content-2 says:

Text nodes and attribute values must consist of Unicode characters, must not contain U+0000 characters, must not contain permanently undefined Unicode characters (noncharacters), and must not contain control characters other than space characters. This specification includes extra constraints on the exact value of Text nodes and attribute values depending on their precise context.

Add a check for that.
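
For illustration only, the rule quoted above could be checked roughly as follows. This is a minimal Python sketch, not the checker's actual code: the names `find_disallowed`, `is_noncharacter`, `is_control` and `HTML_SPACE_CHARS` are hypothetical, and it assumes "space characters" means tab, line feed, form feed, carriage return and space, and that "control characters" covers U+0000 to U+001F and U+007F to U+009F.

```python
# Assumption: the HTML "space characters" exempted by the quoted rule
# (tab, line feed, form feed, carriage return, space).
HTML_SPACE_CHARS = {0x09, 0x0A, 0x0C, 0x0D, 0x20}


def is_noncharacter(cp: int) -> bool:
    """True for Unicode noncharacters: U+FDD0..U+FDEF and any code point
    whose low 16 bits are FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_control(cp: int) -> bool:
    """True for C0 controls, DELETE, and C1 controls."""
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def find_disallowed(text: str) -> list[tuple[int, int, str]]:
    """Return (index, code point, reason) for each character the rule forbids."""
    problems = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        if cp == 0:
            problems.append((index, cp, "U+0000"))
        elif is_noncharacter(cp):
            problems.append((index, cp, "noncharacter"))
        elif is_control(cp) and cp not in HTML_SPACE_CHARS:
            problems.append((index, cp, "control character"))
    return problems
```

The function would be run over the decoded text nodes and attribute values, not over the raw markup, since the rule applies to the parsed content.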

r12a commented 8 years ago

Although we do a lot of work on character encodings, this specific topic is not really i18n related. I already added a test for C0/C1 characters written as escapes (mainly because of things like the euro sign, €, often being escaped as a code point in the C1 range, such as &#x80;), and I think that should suffice.
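
As a rough illustration of what a test for C0/C1 escapes might look like, here is a minimal Python sketch. It is not the checker's implementation: `NCR` and `find_control_escapes` are hypothetical names, and the sketch assumes that every numeric character reference in U+0000 to U+001F or U+0080 to U+009F should be reported, including whitespace escapes such as `&#10;`.

```python
import re

# Matches hexadecimal (&#x80;) and decimal (&#128;) numeric character references.
NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def find_control_escapes(source: str) -> list[tuple[str, int]]:
    """Return (escape text, code point) for escapes in the C0 or C1 range."""
    hits = []
    for match in NCR.finditer(source):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # C0 controls are U+0000..U+001F; C1 controls are U+0080..U+009F.
        if cp <= 0x1F or 0x80 <= cp <= 0x9F:
            hits.append((match.group(0), cp))
    return hits
```

For example, `find_control_escapes("&#x80;5 or &#8364;5")` reports only the first escape, because U+20AC (decimal 8364) is the actual euro sign and lies outside the control ranges.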