Open grahamgower opened 1 year ago
See #9.
@IsabelMarleen I think we had a similar problem when using the reference parser in the demes-python test suite (but only in continuous integration on Windows). It turned out to be an issue with Python choosing the encoding for stdout to match the OS-configured locale, which on Windows was utf16 by default. The solution was to call python with the -X utf8 option to override the default encoding.
https://github.com/popsim-consortium/demes-python/blob/392c6a0eb5e70223a00d6659df2134317a94bdf0/tests/test_spec.py#L33-L34
I guess you're using a locale on your computer for which the default encoding is utf16? Could you try adding the -X utf8 option when calling the reference parser here:
https://github.com/RacimoLab/demes-r/blob/f484f43d9d4e194dc11f1e7938d74aa3fd8dcf22/tests/testthat/helper-functions.R#L43
Some discussions about encoding here: https://github.com/popsim-consortium/demes-spec/issues/129
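For reference, here is a minimal sketch of what passing -X utf8 to a child Python process looks like. It does not use the reference parser; the probe one-liner and the run_child helper are hypothetical names made up for illustration. It also shows that -X is an interpreter option, so it only takes effect when it comes before the program being run. Placed afterwards, the same tokens are handed to the program as ordinary arguments.

```python
import subprocess
import sys


def run_child(args):
    """Run a child Python interpreter with the given arguments and return its stdout."""
    result = subprocess.run(
        [sys.executable, *args],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8").strip()


# Hypothetical probe: report the UTF-8 mode flag, the stdout encoding,
# and any extra command-line arguments the child program received.
PROBE = "import sys; print(sys.flags.utf8_mode, sys.stdout.encoding, sys.argv[1:])"

# Option placed before the program: the interpreter enables UTF-8 mode.
before = run_child(["-X", "utf8", "-c", PROBE])

# Same tokens placed after the program: they end up in sys.argv instead,
# and the encoding is whatever the environment would have chosen anyway.
after = run_child(["-c", PROBE, "-X", "utf8"])

print(before)
print(after)
```

With a script file instead of -c the same ordering applies: the option goes between the interpreter and the script path.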
I tried it just now and it did not make a difference. I ran python3 reference_implementation/resolve_yaml.py test-cases/valid/unicode_deme_name_04.yaml -X utf8 > tmp.json and in the output the property in question looks like "name": "\ud867\ude3d". When trying to read tmp.json with yaml::read_yaml() I get the following error:
Error in yaml.load(string, error.label = error.label, ...) : (tmp.json) Scanner error: while parsing a quoted scalar at line 9, column 15 found invalid Unicode character escape code at line 9, column 18
Without specifying -X utf8, the yaml parser worked when I specified fileEncoding=UTF-16, but that throws a different error now. The scanner error is the same one I encountered before, however.
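A small sketch of what the "\ud867\ude3d" value means may help here. In JSON, a character outside the Basic Multilingual Plane can be written as two \u escapes forming a UTF-16 surrogate pair, and the text containing them is plain ASCII regardless of the file's encoding. YAML's \u escape denotes a single code point, and a lone surrogate is not a valid one, which would be consistent with the scanner error at column 18. The example assumes the output was produced by something like Python's json module with default settings, which this thread does not confirm.

```python
import json

# The escaped name as it appears in the JSON text reported above.
json_text = r'{"name": "\ud867\ude3d"}'

# The text is pure ASCII, so its bytes are the same in ASCII and UTF-8.
assert json_text.isascii()

# A JSON parser combines the surrogate pair into one code point.
name = json.loads(json_text)["name"]
print(len(name), hex(ord(name)))

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True);
# ensure_ascii=False writes the character itself instead.
escaped = json.dumps({"name": name})
literal = json.dumps({"name": name}, ensure_ascii=False)

print(escaped)
print(literal.encode("utf-8"))
```

If that assumption holds, writing the output with ensure_ascii=False (and a UTF-8 stdout) would give a file that both JSON and YAML readers can handle, since the name is then stored as the character itself rather than as escapes.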
What operating system are you using?
I'm using macOS.
The reference parser output for the valid test case unicode_deme_name_04.yaml from the demes-spec repo is encoded as utf16, when it should be encoded as utf8.