Open LuisBecker0 opened 2 years ago
It looks like this library still implements YAML 1.1, which is not a JSON superset. PyYAML/LibYAML have exactly the same problem.
If control characters cannot be parsed correctly, could an option be provided that lets users replace them with \uFFFD
or another character? That way the rest of the content could still be parsed correctly.
I have tried replacing them myself before deserialization, but I cannot guarantee that this covers every character the library does not support.
// Replace returns a copy of in where every rune for which replacer
// reports true is swapped for replacement.
func Replace(in string, replacer func(rune) bool, replacement rune) string {
	var result bytes.Buffer
	for _, r := range in {
		if replacer(r) {
			result.WriteRune(replacement)
		} else {
			result.WriteRune(r)
		}
	}
	return result.String()
}
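For illustration, here is a minimal sketch of one possible predicate for this workaround, using the standard library's `strings.Map` instead of a hand-written loop. The names `isYAMLPrintable` and `sanitizeForYAML` are hypothetical, and the ranges are my reading of the `c-printable` production in chapter 5 of the YAML spec. The set this library actually rejects may differ, so treat it as an assumption rather than a guarantee.

```go
package main

import (
	"fmt"
	"strings"
)

// isYAMLPrintable reports whether r falls in the c-printable set as I
// read it from chapter 5 of the YAML spec (assumption: the library may
// accept or reject a slightly different set).
func isYAMLPrintable(r rune) bool {
	switch {
	case r == 0x09, r == 0x0A, r == 0x0D, r == 0x85:
		return true
	case r >= 0x20 && r <= 0x7E:
		return true
	case r >= 0xA0 && r <= 0xD7FF:
		return true
	case r >= 0xE000 && r <= 0xFFFD:
		return true
	case r >= 0x10000 && r <= 0x10FFFF:
		return true
	}
	return false
}

// sanitizeForYAML (hypothetical helper) replaces every rune outside the
// set above with U+FFFD before the input is handed to the YAML parser.
func sanitizeForYAML(in string) string {
	return strings.Map(func(r rune) rune {
		if isYAMLPrintable(r) {
			return r
		}
		return '\uFFFD'
	}, in)
}

func main() {
	// "\u0080" is the C1 control character; note that "\x80" in a Go
	// string literal would be a single invalid UTF-8 byte instead.
	fmt.Printf("%q\n", sanitizeForYAML("key: \"a\u0080b\""))
}
```

This is lossy: it runs over the whole stream, so it also rewrites characters inside quoted scalars, which is why a parser-level option would still be preferable.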
According to chapter 5 of the YAML specification, I would expect the library to parse streams that contain non-C0 control characters inside quoted scalars. I ran into an issue when parsing a stream containing the C1 control character "\x80". See playground.