Open GoogleCodeExporter opened 9 years ago
A UTF-8 encoded file/stream that contains non-ASCII characters is not parsed correctly. For instance, if the file/stream contains the word "café" encoded in UTF-8, the parser outputs "cafÃ©" instead. This happens because TSuperObject.ParseStream() only looks for a UTF-16 BOM and assumes anything else is ANSI. It does not look for a UTF-8 BOM, nor does it let the caller specify the source encoding (for example with the SysUtils.TEncoding class in Delphi/C++Builder 2009 and later) when no BOM is present but the caller otherwise knows the source encoding.
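To illustrate the behaviour being requested, here is a small, language-neutral sketch in Python (not the library's Delphi code). It reproduces the symptom by decoding UTF-8 bytes as Windows-1252, then shows BOM sniffing with a caller-supplied fallback. The function name `decode_with_bom` and the `fallback` parameter are hypothetical, and using Windows-1252 as the "ANSI" code page is an assumption, since the actual ANSI code page depends on the system locale.

```python
import codecs


def decode_with_bom(data: bytes, fallback: str = "cp1252") -> str:
    """Decode bytes using a BOM if present, else the caller's fallback encoding.

    Hypothetical helper for illustration only; "cp1252" stands in for ANSI.
    """
    # Check for the UTF-8 BOM first (EF BB BF).
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # Then the two UTF-16 BOMs (FF FE and FE FF).
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM: trust the encoding the caller says the source uses.
    return data.decode(fallback)


raw = "café".encode("utf-8")  # b'caf\xc3\xa9'

# The reported symptom: UTF-8 bytes treated as ANSI.
print(decode_with_bom(raw))                      # cafÃ©

# The two requested fixes: honour a UTF-8 BOM, or accept an explicit encoding.
print(decode_with_bom(codecs.BOM_UTF8 + raw))    # café
print(decode_with_bom(raw, fallback="utf-8"))    # café
```

In Delphi terms, the `fallback` argument would presumably correspond to an optional TEncoding parameter on ParseStream(), which is the second half of the request above.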
Original issue reported on code.google.com by gambi...@gmail.com on 26 Aug 2014 at 2:10