gumbo_tag_from_original_text currently uses isspace to detect illegal whitespace in tag names. isspace will match on \v and \r, which are not illegal according to the spec (https://html.spec.whatwg.org/multipage/syntax.html#tag-name-state).
This can result in an XSS that would not be possible in a standard-compliant parser: in the current implementation, gumbo_tag_from_original_text will return script for the unknown element script\v (or script\r).
Serializers relying on gumbo_tag_from_original_text (such as prettyprint) will transform non-executable <script\v> tags into executable <script> tags.
I had a PR ready with a fix and tests, but for legal reasons I can't sign the CLA. Let me know if it's OK for someone else to merge it and I'll link to the diff.
To fix this, the isspace in gumbo_tag_from_original_text should be replaced with the exact list the spec details, and a test case for parsing <script\v> etc. should be added.
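For illustration, the proposed change could look roughly like the sketch below. It is not the diff mentioned above and does not use any Gumbo types. The names is_tag_name_whitespace and tag_name_length are hypothetical, and the character list (TAB, LF, FF, SPACE) is the one the tag name state in the linked spec switches on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of Gumbo): true only for the whitespace
 * characters listed in the spec's tag name state, i.e. TAB, LF, FF and SPACE.
 * Unlike isspace, it does not match \v or \r. */
static bool is_tag_name_whitespace(char c) {
  return c == '\t' || c == '\n' || c == '\f' || c == ' ';
}

/* Hypothetical helper (not part of Gumbo): given the text that follows '<'
 * in a start tag, return how many leading bytes belong to the tag name.
 * The name ends at the first spec-listed whitespace character or at '/'. */
static size_t tag_name_length(const char* text, size_t length) {
  assert(text != NULL);
  for (size_t i = 0; i < length; ++i) {
    if (is_tag_name_whitespace(text[i]) || text[i] == '/') {
      return i;
    }
  }
  return length;
}
```

With this check, tag_name_length("script\v", 7) is 7, so the name stays script\v rather than being cut down to script, while tag_name_length("script src=x", 12) is still 6. The same loop written with isspace would stop at index 6 in both cases.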