cxflag203 / superobject

Automatically exported from code.google.com/p/superobject

ParseFile() and ParseStream() do not support UTF-8 #59

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
A UTF-8 encoded file or stream that contains non-ASCII characters is parsed
incorrectly. For instance, if the file/stream contains the word "café" encoded
in UTF-8, the parser outputs "café" instead. This happens because
TSuperObject.ParseStream() only looks for a UTF-16 BOM and assumes anything
else is ANSI. It neither looks for a UTF-8 BOM nor lets the caller specify the
source encoding (for example, with the SysUtils.TEncoding class in
Delphi/C++Builder 2009 and later) when no BOM is present but the caller
otherwise knows the source encoding.

Original issue reported on code.google.com by gambi...@gmail.com on 26 Aug 2014 at 2:10