windbreakerdoss / owasp-esapi-java

Automatically exported from code.google.com/p/owasp-esapi-java

HTMLEntityCodec#decode incorrectly decodes upper-case accented letters as their lower-case counterparts #296

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. System.out.println(new org.owasp.esapi.codecs.HTMLEntityCodec().decode("&Aacute;"));

What is the expected output? What do you see instead?

I would expect HTMLEntityCodec to decode "&Aacute;" as a capital 
"Á". Instead, it outputs a lower-case "á". The same is true for all HTML 
entities whose name fits the "&*acute;" pattern.

What version of the product are you using? On what operating system?

Version 2.0.1 on MacOS X 10.8.2.

Does this issue affect only a specified browser or set of browsers?

Nope, this is an API issue.

Please provide any additional information below.

Checking out your source code from trunk (25/3/2013), it seems the problem is 
line 253 of HTMLEntityCodec.java (in method getNamedEntity): 

    possible.append(Character.toLowerCase(input.next())); 

This lower-cases every character as it is read from the input stream, 
so the case that distinguishes "&Aacute;" from "&aacute;" is lost.
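
To illustrate the effect described above, here is a minimal, self-contained sketch. It does not use ESAPI: the class `EntityCaseSketch`, its four-entry entity table, and the methods `readName`, `decodeLowerCasing` and `decodePreservingCase` are all hypothetical names made up for this example, and the real `getNamedEntity` may be structured quite differently. It only shows why lower-casing the name while reading it collapses "&Aacute;" onto "&aacute;".

```java
import java.util.HashMap;
import java.util.Map;

public class EntityCaseSketch {
    // Hypothetical, tiny entity table (name -> code point); not ESAPI's table.
    private static final Map<String, Integer> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("Aacute", 0x00C1); // capital A with acute
        ENTITIES.put("aacute", 0x00E1); // small a with acute
        ENTITIES.put("Eacute", 0x00C9); // capital E with acute
        ENTITIES.put("eacute", 0x00E9); // small e with acute
    }

    // Reads the name between the leading '&' and the ';',
    // optionally lower-casing each character as it is read.
    static String readName(String entity, boolean lowerCase) {
        StringBuilder possible = new StringBuilder();
        for (int i = 1; i < entity.length() && entity.charAt(i) != ';'; i++) {
            char c = entity.charAt(i);
            possible.append(lowerCase ? Character.toLowerCase(c) : c);
        }
        return possible.toString();
    }

    // Mimics the reported behaviour: the name is lower-cased before lookup.
    // Returns -1 for an unknown name.
    static int decodeLowerCasing(String entity) {
        return ENTITIES.getOrDefault(readName(entity, true), -1);
    }

    // Keeps the case of the name, so "Aacute" and "aacute" stay distinct.
    // Returns -1 for an unknown name.
    static int decodePreservingCase(String entity) {
        return ENTITIES.getOrDefault(readName(entity, false), -1);
    }

    public static void main(String[] args) {
        System.out.printf("U+%04X%n", decodeLowerCasing("&Aacute;"));    // U+00E1
        System.out.printf("U+%04X%n", decodePreservingCase("&Aacute;")); // U+00C1
    }
}
```

Whether the lower-casing in the real codec is deliberate (for example, to accept names written in the wrong case) is not something I have checked. If it is, one possible approach would be to look up the exact-case name first and fall back to the lower-cased name only when that fails.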

Original issue reported on code.google.com by bja...@twigkit.com on 25 Mar 2013 at 4:05