batchu / owasp-esapi-java

Automatically exported from code.google.com/p/owasp-esapi-java

CSSCodec.decodeCharacter [trunk] returns the character after the slash in CSS escape sequences that produce a character with an invalid codepoint. #109

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. CSSCodec.decode("\\abcdefg");

Expected: "\abcdefg" (the unaltered input)
Actual: "ag"

This concerns CSSCodec from the current revision of trunk.  My suggestion
for a fix is attached as csscodec_decchar.patch and csscodectest.patch adds
a test case.

This issue would also affect the 1.4 branch if detection of invalid code
points were added. Using the previous example, the decoded character would
be returned (along with the "g") in this branch because
Character.isValidCodePoint() is not available in Java 1.4.
I suggest at least ignoring escape sequences that result in a character
with an ordinal number greater than 0x10FFFF. I've attached a patch which
implements this and also fixes the aforementioned issue.
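
For illustration, here is a minimal sketch of the range check suggested above. It is not the attached patch. The class name CodePointCheck and the helper parseEscape are hypothetical, and the sketch assumes the hex digits of the escape have already been extracted. It uses only calls that exist in Java 1.4.

```java
class CodePointCheck {
    // Highest code point defined by Unicode.
    static final int MAX_CODE_POINT = 0x10FFFF;

    // Hypothetical helper: takes the 1 to 6 hex digits of a CSS escape and
    // returns the code point, or -1 when the value is outside the Unicode range.
    static int parseEscape(String hexDigits) {
        int value = Integer.parseInt(hexDigits, 16);
        if (value < 0 || value > MAX_CODE_POINT) {
            return -1; // caller should leave the escape undecoded
        }
        return value;
    }

    public static void main(String[] args) {
        // "abcdef" is 0xABCDEF, which is above 0x10FFFF.
        System.out.println(parseEscape("abcdef"));
        // "41" is the code point for 'A'.
        System.out.println(parseEscape("41"));
    }
}
```

With the input from the report, the digits "abcdef" evaluate to 0xABCDEF, so the check rejects the escape rather than returning a character.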

Original issue reported on code.google.com by jahboite@gmail.com on 19 Feb 2010 at 1:02


GoogleCodeExporter commented 8 years ago
Thanks for the report and the patches. Sorry it took so long to look at it.

The CSS decoder seems to have several issues, including the one you mentioned. It
also looks like it doesn't remove the white space that follows a hex escape.

The spec (http://www.w3.org/TR/CSS21/syndata.html#characters) suggests that
invalid Unicode characters should be replaced by the replacement character U+FFFD.
How would that be instead?
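
To make the proposal concrete, here is a rough sketch of that behaviour. It is not the CSSCodec code and not what was later committed. The class CssEscapeSketch and its methods are hypothetical, it handles hex escapes only, and it assumes Java 5 or later for Character.isValidCodePoint() and StringBuilder.appendCodePoint(). Surrogate values and code point zero get no special treatment here.

```java
class CssEscapeSketch {
    static final char REPLACEMENT = '\uFFFD';

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Hypothetical decoder: handles only backslash + 1..6 hex digits.
    static String decode(String in) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            boolean hexEscape = c == '\\' && i + 1 < in.length()
                    && hexValue(in.charAt(i + 1)) >= 0;
            if (!hexEscape) {
                out.append(c); // anything else is copied through unchanged
                i++;
                continue;
            }
            i++; // skip the backslash
            int value = 0;
            int digits = 0;
            while (i < in.length() && digits < 6 && hexValue(in.charAt(i)) >= 0) {
                value = value * 16 + hexValue(in.charAt(i));
                i++;
                digits++;
            }
            // A single white space character (or CR LF) after the digits
            // belongs to the escape and is dropped.
            if (i < in.length()) {
                char w = in.charAt(i);
                if (w == '\r' && i + 1 < in.length() && in.charAt(i + 1) == '\n') {
                    i += 2;
                } else if (w == ' ' || w == '\t' || w == '\r' || w == '\n' || w == '\f') {
                    i++;
                }
            }
            if (Character.isValidCodePoint(value)) {
                out.appendCodePoint(value);
            } else {
                out.append(REPLACEMENT); // out of range: U+FFFD instead
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, the input from the report decodes to U+FFFD followed by "g", and "\41 B" decodes to "AB" because the space terminating the escape is consumed.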

Original comment by schal...@darkmist.net on 2 May 2010 at 5:37

GoogleCodeExporter commented 8 years ago
OK. I went with the replacement character method, plus some other fixes and test
cases. It's committed as revision 1387. Give it a try if you would.

Original comment by schal...@darkmist.net on 2 May 2010 at 6:47