go-yaml / yaml

YAML support for the Go language.

Unexpected "control characters are not allowed" #872

Open LuisBecker0 opened 2 years ago

LuisBecker0 commented 2 years ago

According to chapter 5 of the YAML specification, I would expect the library to parse streams that contain non-C0 control characters inside quoted scalars. I encountered an issue when parsing a stream containing the C1 control character "\x80". See playground.

fenollp commented 2 years ago

See https://github.com/go-yaml/yaml/issues/737

WGH- commented 2 years ago

It looks like this library still implements YAML 1.1, which is not a JSON superset. PyYAML/LibYAML have exactly the same problem.

gitsang commented 9 months ago

If control characters cannot be parsed, could an option be provided that lets users replace them with \uFFFD or another character? That would ensure the rest of the content can still be parsed.

I have tried replacing characters before deserialization, but I cannot guarantee that this covers every case the library does not support.

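import "bytes"

// Replace returns a copy of in where every rune for which replacer
// reports true is substituted with replacement.
func Replace(in string, replacer func(rune) bool, replacement rune) string {
        var result bytes.Buffer
        for _, r := range in {
                if replacer(r) {
                        // Rune was flagged by the caller: write the substitute.
                        result.WriteRune(replacement)
                } else {
                        result.WriteRune(r)
                }
        }
        return result.String()
}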
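
One way to choose the predicate for a replacement pass like the one above is to follow the printable character set given in the YAML 1.1 specification (tab, LF, CR, U+0020 to U+007E, U+0085, U+00A0 to U+D7FF, U+E000 to U+FFFD, U+10000 to U+10FFFF). The sketch below does that with `strings.Map`. The names `isYAMLPrintable` and `sanitizeYAML` are hypothetical helpers, not part of go-yaml, and it is an assumption that the library rejects exactly the runes outside this set.

```go
package main

import "strings"

// isYAMLPrintable reports whether r falls in the printable character set
// listed in the YAML 1.1 specification. Hypothetical helper; that go-yaml
// rejects exactly the complement of this set is an assumption.
func isYAMLPrintable(r rune) bool {
	switch {
	case r == 0x09, r == 0x0A, r == 0x0D, r == 0x85:
		return true // tab, line feed, carriage return, next line (NEL)
	case r >= 0x20 && r <= 0x7E:
		return true // printable ASCII
	case r >= 0xA0 && r <= 0xD7FF:
		return true // above the C1 block, below the surrogates
	case r >= 0xE000 && r <= 0xFFFD:
		return true
	case r >= 0x10000 && r <= 0x10FFFF:
		return true // supplementary planes
	}
	return false
}

// sanitizeYAML (hypothetical) replaces every rune outside the printable
// set with U+FFFD and leaves all other runes untouched.
func sanitizeYAML(in string) string {
	return strings.Map(func(r rune) rune {
		if isYAMLPrintable(r) {
			return r
		}
		return '\uFFFD'
	}, in)
}
```

Calling `sanitizeYAML` on the input before handing it to the parser would turn the C1 character U+0080 into U+FFFD while keeping U+0085, which the specification lists as printable. Note that this changes the data, so it only suits cases where losing the original control characters is acceptable.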