changcheng / wro4j

Automatically exported from code.google.com/p/wro4j

Wrong encoding for IOUtils.toString for non-Unix platforms #265

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Preconditions:
1. Windows platform is used.
2. One source JavaScript file is present, containing Cyrillic text (e.g. 
"текст") and encoded in UTF-8.

What steps will reproduce the problem?
1. Run Maven with the goal wro4j:run on the files attached to this issue.

What is the expected output? What do you see instead?
Expected result: A minimized all.js output file is generated. None of the 
Cyrillic text is corrupted.
Actual result: A minimized all.js output file is generated. The Cyrillic 
text cannot be read because of the wrong encoding.

What version of the product are you using? On what operating system?
wro4j: 1.3.8
OS: Windows XP 2002 Service Pack 2
Java: 1.6.0_23
Maven: 3.0.3

Please provide any additional information below.
In the file GoogleClosureCompressorProcessor.java the IOUtils.toString method 
is used (rev. a6a1c1db5b90, line 78) to get the contents of the Reader as a 
String. This method uses the default character encoding of the platform if no 
specific encoding is provided. On Linux the default platform encoding is "UTF8", 
while Windows uses "Cp1252". As a result, all the Cyrillic text in the output 
files is corrupted. Setting the JAVA_TOOL_OPTIONS environment variable to 
"-Dfile.encoding=UTF8" does not solve the problem.

Possibly an InputStream could be used instead of a Reader: IOUtils has two 
toString methods that accept an encoding parameter (one for byte[], another for 
InputStream).
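
For illustration only, here is a minimal sketch of reading a stream with an 
explicit charset instead of relying on the platform default. It uses only the 
JDK, not IOUtils, and is not the wro4j code; the class name ExplicitCharsetRead 
and the helper readAll are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetRead {

    // Hypothetical helper: drains the stream, decoding its bytes with the
    // charset passed in rather than the platform default.
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try {
            Reader reader = new InputStreamReader(in, charset);
            try {
                char[] buf = new char[4096];
                int n;
                while ((n = reader.read(buf)) != -1) {
                    sb.append(buf, 0, n);
                }
            } finally {
                reader.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Cyrillic sample word from the report, written with Unicode
        // escapes so the source file encoding does not matter.
        String original = "\u0442\u0435\u043a\u0441\u0442";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String decoded = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(decoded.equals(original)); // prints true
    }
}
```

Because the charset is passed explicitly, the result should be the same on 
Windows and Linux regardless of file.encoding.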

Original issue reported on code.google.com by maximfil...@gmail.com on 1 Aug 2011 at 6:23

Attachments:

GoogleCodeExporter commented 9 years ago
The issue should be fixed in branch 1.4.x on github. Could you try and confirm 
that?
Thanks!

Original comment by alex.obj...@gmail.com on 1 Aug 2011 at 9:02

GoogleCodeExporter commented 9 years ago
Also, when you work with a Reader, you shouldn't need to worry about encoding 
anymore. The only place where you need to take care of encoding is when 
transforming an InputStream into a String.
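
A small sketch of that point, using only the JDK (the class ReaderDecodingDemo 
and its helpers drain and decode are hypothetical names, and it assumes the 
"windows-1252" charset is available in the running JDK): copying chars out of a 
Reader involves no charset, while the charset chosen when bytes are wrapped in 
a Reader decides whether the text survives.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderDecodingDemo {

    // Hypothetical helper: copies all chars from a Reader.
    // No charset is involved at this stage.
    public static String drain(Reader reader) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // The charset matters only here, where bytes become chars.
    public static String decode(byte[] bytes, Charset charset) {
        return drain(new InputStreamReader(new ByteArrayInputStream(bytes), charset));
    }

    public static void main(String[] args) {
        // The Cyrillic sample word from the report, as Unicode escapes.
        String original = "\u0442\u0435\u043a\u0441\u0442";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Reader to String: lossless, no encoding involved.
        System.out.println(drain(new StringReader(original)).equals(original)); // prints true
        // Bytes decoded with the charset they were written in: intact.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(original)); // prints true
        // Same bytes decoded with the wrong charset: already corrupted.
        System.out.println(decode(utf8, Charset.forName("windows-1252")).equals(original)); // prints false
    }
}
```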

Original comment by alex.obj...@gmail.com on 1 Aug 2011 at 9:04

GoogleCodeExporter commented 9 years ago
Unfortunately, I do not see this issue fixed on branch 1.4.x on github.
The output file still has "ANSI" encoding while the input one has "UTF-8", and 
the Cyrillic text in the output file still cannot be read properly (the text 
editor shows question marks instead). Please find an example of the output file 
in the attachment.
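
One way question marks like these can arise, sketched with plain JDK calls 
(this is an illustration, not a diagnosis of the plugin; the class 
LossyEncodeDemo and the helper roundTrip are hypothetical, and it assumes the 
"windows-1252" charset is available in the running JDK): when text is encoded 
to bytes with a charset that has no Cyrillic letters, String.getBytes 
substitutes each unmappable character with "?".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyEncodeDemo {

    // Encodes the text to bytes with the given charset, then decodes the
    // bytes back with the same charset.
    public static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The Cyrillic sample word from the report, as Unicode escapes.
        String original = "\u0442\u0435\u043a\u0441\u0442";

        // UTF-8 can represent Cyrillic, so the round trip is lossless.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8).equals(original)); // prints true

        // windows-1252 has no Cyrillic letters, so each one becomes '?'.
        System.out.println(roundTrip(original, Charset.forName("windows-1252"))); // prints ?????
    }
}
```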

On the other hand, if I just replace line 78 in 
GoogleClosureCompressorProcessor.java:
final String content = IOUtils.toString(reader);
with a simple read using the FileReader class, for example, then the output 
files are created with the proper encoding:

String content = "";
FileReader fr = null;
try {
    // Note: FileReader decodes using the platform default charset.
    fr = new FileReader("c:\\in.js");
    BufferedReader br = new BufferedReader(fr);
    String line;
    while ((line = br.readLine()) != null) {
        content += line + "\r\n";
    }
} catch (Exception e) {
    System.err.println("Error: " + e.getMessage());
} finally {
    // Guard against a failed open, which would leave fr null.
    if (fr != null) {
        fr.close();
    }
}

Original comment by maximfil...@gmail.com on 2 Aug 2011 at 6:22

Attachments:

GoogleCodeExporter commented 9 years ago
Ok, I'll take a look and will try to reproduce the problem. 

Btw Maxim, using a forked version or a gist to demonstrate a bug might help a 
lot (since you are already a github user). :)

Original comment by alex.obj...@gmail.com on 2 Aug 2011 at 6:34

GoogleCodeExporter commented 9 years ago
Maxim, I just want to be sure you have tested the proper 'fixed' version. Did 
you check out branch 1.4.x from here: https://github.com/alexo/wro4j/tree/1.4.x ?
After checking it out, did you install it and use the maven plugin with version 
1.4.0-SNAPSHOT in your example? 

I keep asking this because I've managed to process a resource containing 
Russian text and the output was as expected. I'll push the changes proving that 
to github. Therefore, I need some support from your side to sort this out.

Thanks!

Original comment by alex.obj...@gmail.com on 2 Aug 2011 at 9:11

GoogleCodeExporter commented 9 years ago
I have created a new test branch just for this issue, called "encodingIssue". 
https://github.com/alexo/wro4j/blob/encodingIssue/wro4j-examples/pom.xml

Check out the project, install it and run mvn wro4j:run in the wro4j-examples 
project. Let me know if you still see the encoding problem.

Original comment by alex.obj...@gmail.com on 2 Aug 2011 at 9:40

GoogleCodeExporter commented 9 years ago
Hi Alex,

Thanks for fixing this issue!

> Did you checkout branch 1.4.x from here: 
https://github.com/alexo/wro4j/tree/1.4.x?

Yes, I checked out this branch. I even verified the branch with the 
"git branch" command, which reported the right one: "1.4.x". I have just 
double-checked whether the issue can be reproduced on this branch and yes, the 
issue is still there.

I have also verified the issue on the "encodingIssue" branch, and the issue is 
fixed in this version of the plugin! The output JavaScript file is created 
properly. The only thing I had to do was remove the 
TestGoogleClosureCompressorProcessor.java test, because it fails every time the 
plugin is built (mvn clean install). I have attached the resulting log file 
containing the output produced by maven.

Thanks,
Maksim

Original comment by maximfil...@gmail.com on 3 Aug 2011 at 8:27

Attachments:

GoogleCodeExporter commented 9 years ago
Great, so I'll close this issue with duplicate status (since there are two 
other related issues).

Original comment by alex.obj...@gmail.com on 3 Aug 2011 at 9:09

GoogleCodeExporter commented 9 years ago
Maxim, can you confirm that it is ok with release 1.4.0?

Thanks

Original comment by alex.obj...@gmail.com on 1 Sep 2011 at 11:28

GoogleCodeExporter commented 9 years ago
Hi Alex,

The issue cannot be reproduced on 1.4.1 (downloaded 
alexo-wro4j-v1.4.1-53-gaa67c0a.zip).

Thanks,
Maksim

Original comment by maximfil...@gmail.com on 4 Nov 2011 at 5:42

GoogleCodeExporter commented 9 years ago
Hi Maksim, thanks for confirming that. 

Original comment by alex.obj...@gmail.com on 4 Nov 2011 at 11:08

GoogleCodeExporter commented 9 years ago
I have tried 1.4.2, and it still has this problem.

Original comment by xbaof...@gmail.com on 14 Dec 2011 at 7:09

GoogleCodeExporter commented 9 years ago
Hi xbaofeng,
Can you describe your test-case?

Original comment by alex.obj...@gmail.com on 14 Dec 2011 at 8:40