Closed felixbuenemann closed 7 years ago
This fixes a bug in the regular expression for escaping invalid XML characters.
Ruby requires the special syntax \u{hex} for matching Unicode characters whose code point has more than 4 hex digits, because plain \u consumes exactly 4 hex digits.
Because of this, the regex range [^\u10000-\u10FFFF] was interpreted as a negated class containing three separate items: the character \u1000, the range 0-\u10FF, and the literal characters FF.
I found this bug because RUBYOPT=-W3 triggered a duplicate character range warning for the regex.
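The parsing difference described above can be checked with a small sketch. The two character classes below are simplified, non-negated stand-ins chosen for illustration, not the full regex touched by this patch, and the constant names are hypothetical.

```ruby
# Illustrative stand-ins only; the real regex in the patch is longer.
# \u consumes exactly 4 hex digits, so this class holds \u1000, the range
# "0"-\u10FF, and two literal "F" characters.
BROKEN = Regexp.new('[\u10000-\u10FFFF]')
# With braces the full code points U+10000..U+10FFFF form one range.
FIXED  = Regexp.new('[\u{10000}-\u{10FFFF}]')

ASTRAL = "\u{1F600}" # a code point above U+FFFF
DIGIT  = "5"         # U+0035, inside the accidental 0-\u10FF range

puts BROKEN.match?(ASTRAL) # false: the supplementary planes are not covered
puts BROKEN.match?(DIGIT)  # true: the digit falls into the unintended range
puts FIXED.match?(ASTRAL)  # true
puts FIXED.match?(DIGIT)   # false
```

Running the same snippet with warnings enabled (for example via RUBYOPT=-W3, as mentioned above) should surface the duplicate-range warning for the first pattern, since \u1000 and F are already covered by 0-\u10FF.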