Closed jim-minter closed 3 years ago
Yikes, using master (v1.2.5-0.20210320190651-a2bb12368408) if I increase the concurrency of the above test from 2 to 100, I see:
runtime: pointer 0xc0006c0880 to unused region of span span.base()=0xc00085a000 span.limit=0xc00085c000 span.state=1
fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)
I don't seem to see this with v1.2.2, or with master and -tags safe
Hmm, I think I've got myself confused with the precise versions that work and don't. Here is a simplified test with no dependencies; this seems to pass here with v1.2.2 and v1.2.3, but not v1.2.4 or master. When n = 1 (no concurrency), it passes on v1.2.4 and master.
package ugorji

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"reflect"
	"sync/atomic"
	"testing"
	"time"

	"github.com/ugorji/go/codec"
)

type secureBytes []byte

type secureBytesExt struct{}

func (secureBytesExt) ConvertExt(v interface{}) interface{} {
	// encryption code removed
	return base64.StdEncoding.EncodeToString(v.(secureBytes))
}

func (secureBytesExt) UpdateExt(dest interface{}, v interface{}) {
	// decryption code removed
	b, err := base64.StdEncoding.DecodeString(v.(string))
	if err != nil {
		panic(err)
	}
	*dest.(*secureBytes) = b
}

var done int32

func testExtensionsConcurrency(i int, h *codec.JsonHandle) error {
	want := secureBytes(fmt.Sprintf("%d", i))

	for {
		var b []byte
		e := codec.NewEncoderBytes(&b, h)
		err := e.Encode(want)
		if err != nil {
			return err
		}

		var got secureBytes
		d := codec.NewDecoderBytes(b, h)
		err = d.Decode(&got)
		if err != nil {
			return err
		}

		if !bytes.Equal(want, got) {
			return fmt.Errorf("%d: want: %s, got: %s", i, string(want), string(got))
		}

		// reading from a closed channel perturbs the scheduling too much and
		// masks unsafe errors we're currently seeing in ugorji/go master
		if atomic.LoadInt32(&done) != 0 {
			break
		}
	}

	return nil
}

func TestExtensionsConcurrency(t *testing.T) {
	const n = 100

	h := &codec.JsonHandle{
		BasicHandle: codec.BasicHandle{
			DecodeOptions: codec.DecodeOptions{
				ErrorIfNoField: true,
			},
		},
	}

	err := h.SetInterfaceExt(reflect.TypeOf(secureBytes{}), 1, secureBytesExt{})
	if err != nil {
		t.Fatal(err)
	}

	errch := make(chan error, n)
	for i := 0; i < n; i++ {
		go func(i int) {
			errch <- testExtensionsConcurrency(i, h)
		}(i)
	}

	time.AfterFunc(5*time.Second, func() {
		atomic.StoreInt32(&done, 1)
	})

	for i := 0; i < n; i++ {
		err = <-errch
		if err != nil {
			t.Fatal(err)
		}
	}
}
Bad bug. Great catch. And thanks for the excellent test case that I could quickly use to reproduce the issue.
Fix coming soon, and I will tag this with need test so I remember to put in a test case.
@jim-minter please let me know that it resolves the problem on your end, and I will cut a new release this weekend.
BTW the race issue you found was fixed by: 76603559030f291b11839a54bae70045f07d583f
Hi @ugorji, using 11d01daad36cb5b406765bc9936bd96e3e18cb31 I am no longer able to trigger the bug, the race detector or the "found bad pointer in Go heap" error, so this lgtm. Thanks for the quick turnaround!
Looking at it, I'm a little wary of the unsafe code now, and I'm wondering whether we might start building with -tags codec.safe. I think that less performance and more safety is probably a better tradeoff for what we're doing. Regardless of that statement, thanks so much for this project! Among other things, MissingFielder is priceless.
Treating "transient" values is the only place I know that might interact negatively with the GC. The reason is that we were using the same memory space for things that looked like pointers and things that didn't, leading GC to track an allocated memory space as a pointer, and barf when it saw something else there.
The changes fix this: we now only use transient memory space for numbers/bool and struct/array with no internal pointers, and use string/slice shaped value for the string/slice case.
The main benefit of the transient memory space is that it can dramatically reduce heap allocations for values that would have been stack-allocated in non-reflection mode.
My advice: you know your risk tolerance, but please keep using it and help flesh out any issues. I don't anticipate any more, but best to find them all.
Thanks.
@jim-minter FYI I cut v1.2.5 today - see https://github.com/ugorji/go/releases/tag/v1.2.5 and https://pkg.go.dev/github.com/ugorji/go/codec
Hi @ugorji, I suspect a race condition which is causing us data corruption with concurrent JSON decodes/encodes of a type implementing codec.InterfaceExt. I think v1.2.3 and v1.2.4 may have the bug; v1.2.2 and master may not.
This is the test case I'm working on at the moment - it is not minimal at this stage, but I hope the data race is clear enough for you to see and understand the issue. The ugorji/go version used in this report is v1.2.4.
Here is the output with v1.2.4:
I think it would be good to consider adding more test coverage to prevent a regression, creating a v1.2.5 release and perhaps marking v1.2.3 and v1.2.4 as bad.