UcasRichard / snakeyaml

Automatically exported from code.google.com/p/snakeyaml
Apache License 2.0

Some Unicode characters are wrongly read with load method #68

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Call yaml.load on a file that contains the character И, for example: test.string: {en: И}

What is the expected output? What do you see instead?
The file should load without problems, but I get the exception "unacceptable 
character #FFFD  special characters are not allowed".

What version of the product are you using? On what operating system?
1.6

Please provide any additional information below.
It fails to read the capital "И".

Original issue reported on code.google.com by olegsm...@gmail.com on 21 Jun 2010 at 1:32

GoogleCodeExporter commented 9 years ago
SnakeYAML has a number of tests to cover Unicode characters. 
Anyway I have added one more test for this issue. It works properly. You can 
see it here:
http://code.google.com/p/snakeyaml/source/browse/src/test/java/org/yaml/snakeyaml/issues/issue68/NonAsciiCharacterTest.java

I think the problem is the encoding. YAML requires the file to be in UTF-8 or 
UTF-16. I suspect you are on Windows and the file was saved in the Cp1252 
character encoding. Since the file has no BOM, SnakeYAML assumes it is UTF-8, 
which is wrong.
If you still think it is a bug in SnakeYAML, could you please submit a patch 
with a test case?

Original comment by aso...@gmail.com on 21 Jun 2010 at 3:14
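
To illustrate the explanation above, here is a minimal sketch that uses only the JDK (no SnakeYAML calls). It assumes the file was saved as Windows-1251, which is what "ANSI" usually means on a Russian-locale Windows; the thread itself does not confirm the exact code page. The class name `EncodingMismatchDemo` and its members are hypothetical. In Windows-1251 the letter И is the single byte 0xC8, which is not a valid UTF-8 sequence on its own, so a UTF-8 decoder substitutes U+FFFD, the character named in the exception.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // The text "en: И" as it would be stored in Windows-1251 (assumed code page).
    // И is the single byte 0xC8 in that encoding.
    static final byte[] ANSI_BYTES = {'e', 'n', ':', ' ', (byte) 0xC8};

    // Decode raw bytes with the given charset; malformed input is replaced
    // with U+FFFD by the String constructor.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Wrong: treating the ANSI bytes as UTF-8, as a parser would without a BOM.
        String wrong = decode(ANSI_BYTES, StandardCharsets.UTF_8);
        // Right: decoding with the charset the file was actually saved in.
        String right = decode(ANSI_BYTES, Charset.forName("windows-1251"));

        System.out.println(wrong.indexOf('\uFFFD') >= 0); // prints true
        System.out.println(right.equals("en: \u0418"));   // prints true (U+0418 is И)
    }
}
```

In practice the fix is either to save the file as UTF-8, or to decode it yourself with the correct charset and pass the resulting text to the parser.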

GoogleCodeExporter commented 9 years ago
You are right. I changed the file encoding to UTF-8 in Java and SnakeYAML reads 
the file perfectly. Interestingly, the file was previously in ANSI encoding and 
Java was using CP1252, and only the character И caused an exception.
The issue is solved. Sorry for the inconvenience, and thank you for a perfect 
YAML parser.

Original comment by olegsm...@gmail.com on 23 Jun 2010 at 8:54

GoogleCodeExporter commented 9 years ago
If you remove "И", the file might be parsed without errors, but you will get 
different characters from the ones you wrote.

Original comment by aso...@gmail.com on 23 Jun 2010 at 9:35