ugorji / go

idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License

Freeze when encoding unicode replacement character #317

Closed Albibek closed 4 years ago

Albibek commented 5 years ago

Hello.

I think I have found a JSON encoding bug: encoding a string that contains the Unicode replacement character (U+FFFD) never returns.

package main

import (
    "fmt"
    "unicode/utf8"

    "github.com/ugorji/go/codec"
)

func main() {
    var err error

    // Unicode REPLACEMENT CHARACTER U+FFFD
    a := []byte{0xef, 0xbf, 0xbd}
    as := string(a)

    m := map[string]interface{}{
        "somekey1": as,
    }

    // the bytes are valid UTF-8
    fmt.Println(utf8.Valid(a))
    // and they decode without error
    fmt.Println(utf8.DecodeRune(a))

    jhandle := &codec.JsonHandle{}

    buf := make([]byte, 0, 1024)
    jenc := codec.NewEncoderBytes(&buf, jhandle)
    // spins at 100% CPU forever
    err = jenc.Encode(&m)
    if err != nil {
        panic(err)
    }

    // never reached
}

Albibek commented 5 years ago

I think the problem is here:

https://github.com/ugorji/go/blob/42bc974514ff101a54c6b72a0e4dee29d96c0b26/codec/json.go#L414-L423

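The code should also handle size == 3. In that case the decoded rune equals utf8.RuneError because the input really contains U+FFFD and was parsed correctly, whereas size == 1 means the byte was invalid.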
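
To illustrate the distinction, here is a minimal standalone sketch using only the standard `unicode/utf8` package. The helper name `classify` is made up for this example and is not part of the codec library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// classify (hypothetical helper) reports how utf8.DecodeRuneInString
// sees the first rune of s.
func classify(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	switch {
	case r != utf8.RuneError:
		return "ordinary rune"
	case size == 3:
		// The input holds a correctly encoded U+FFFD.
		return "literal U+FFFD"
	case size == 1:
		// The first byte is not valid UTF-8.
		return "invalid byte"
	default:
		// size == 0: nothing to decode.
		return "empty input"
	}
}

func main() {
	fmt.Println(classify("\xef\xbf\xbd")) // the bytes from the report above
	fmt.Println(classify("\xff"))         // a byte that is not valid UTF-8
	fmt.Println(classify("a"))
}
```

A loop that advances only in the size == 1 case will therefore stay on the same index whenever it meets a literal U+FFFD.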

70l571y commented 4 years ago

We suggest this solution:


if c == utf8.RuneError {
    if size == 3 { // 1 -> 3 or get rid of this condition because len(utf8.RuneError) is always 3
        if start < i {
            w.writestr(s[start:i])
        }
        w.writestr(`\ufffd`)
        i += uint(size) // i++ -> i += uint(size)
        start = i
        continue // fix endless loop
    }
    //continue //causes endless loop if size != 3
}

It has worked well for us.
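
For anyone who wants to try the idea outside the library, here is a minimal standalone sketch of a loop that always advances by the decoded size. It is not the codec's actual code: `escapeRuneErrors` is a hypothetical function, it uses a `strings.Builder` instead of the encoder's writer, and it ignores every other JSON escaping rule. As an assumption of this sketch, invalid bytes and literal U+FFFD are both written as `\ufffd`.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// escapeRuneErrors (hypothetical) copies s and writes `\ufffd` for every
// position where decoding yields utf8.RuneError, whether the cause is an
// invalid byte (size 1) or a literal U+FFFD (size 3).
func escapeRuneErrors(s string) string {
	var b strings.Builder
	start := 0
	for i := 0; i < len(s); {
		c, size := utf8.DecodeRuneInString(s[i:])
		if c == utf8.RuneError {
			b.WriteString(s[start:i]) // flush the pending clean run
			b.WriteString(`\ufffd`)
			i += size // s[i:] is non-empty here, so size is 1 or 3
			start = i
			continue
		}
		i += size
	}
	b.WriteString(s[start:]) // flush the tail
	return b.String()
}

func main() {
	fmt.Println(escapeRuneErrors("a\xef\xbf\xbdb")) // literal U+FFFD
	fmt.Println(escapeRuneErrors("a\xffb"))         // invalid byte
}
```

Because the index moves forward on every iteration, the loop terminates for any input.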