AndersDJohnson / htmlcompressor

Automatically exported from code.google.com/p/htmlcompressor
Apache License 2.0

UTF8 Encoding problem #49

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Create an HTML file containing the string "Français"
2. Run HTML compressor with or without the --charset UTF-8 option

What is the expected output? What do you see instead?
Expected: "Français" is preserved. Instead, the "ç" is corrupted in the stored output.

What version of the product are you using? On what operating system?
htmlcompressor-1.4.2.jar

Please provide any additional information below.
Using Java 1.5 update 15 on Windows XP SP3

Original issue reported on code.google.com by badd...@gmail.com on 2 Aug 2011 at 10:56

GoogleCodeExporter commented 8 years ago
Sorry, I can't reproduce it. Could you please create a file with this word and 
attach it here?

Original comment by serg472@gmail.com on 2 Aug 2011 at 11:55

GoogleCodeExporter commented 8 years ago
Hi,
Attached are the sample and its compressed version.
When I run this on my personal computer, "Français" is stored as "Fran�ais".
Command: C:\Installable\compressor>java -jar htmlcompressor-1.4.2.jar --charset UTF-8 -o C:\Installable\compressor\sample_compressed.html sample.html

Original comment by badd...@gmail.com on 3 Aug 2011 at 3:16


GoogleCodeExporter commented 8 years ago
Your original file is not UTF-8 encoded; it is in ISO-8859-1. You can check 
this by opening it in a browser and switching the encoding to UTF-8: that 
character will be broken.

Just compress your files with the --charset ISO-8859-1 parameter.

Original comment by serg472@gmail.com on 3 Aug 2011 at 11:51
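
The diagnosis above can be reproduced outside the compressor. The following is a minimal Java sketch, not part of htmlcompressor: the class name CharsetMismatchDemo and its helper methods are hypothetical, and it assumes Java 7 or later for StandardCharsets (the reporter's Java 1.5 would need Charset.forName instead). It decodes the ISO-8859-1 bytes of "Français" as UTF-8 to show the replacement character appearing, and uses a strict decoder to check whether a byte sequence is valid UTF-8 at all.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Decode bytes with the given charset. Like most lenient readers,
    // new String(...) silently substitutes U+FFFD for malformed input.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Strict check: report malformed input instead of substituting U+FFFD.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate a file saved as ISO-8859-1: "ç" is the single byte 0xE7.
        byte[] latin1 = "Fran\u00e7ais".getBytes(StandardCharsets.ISO_8859_1);

        // Wrong charset: 0xE7 followed by 'a' is not a valid UTF-8 sequence.
        String wrong = decode(latin1, StandardCharsets.UTF_8);
        // Matching charset: the text round-trips intact.
        String right = decode(latin1, StandardCharsets.ISO_8859_1);

        System.out.println("valid UTF-8?      " + isValidUtf8(latin1));
        System.out.println("has U+FFFD?       " + (wrong.indexOf('\uFFFD') >= 0));
        System.out.println("round-trips okay? " + right.equals("Fran\u00e7ais"));
    }
}
```

A strict check like isValidUtf8 is one way to decide between --charset UTF-8 and --charset ISO-8859-1 before compressing a file. A failed check only proves the file is not UTF-8; it does not identify the actual encoding.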