apache / arrow-go

Official Go implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0
38 stars 6 forks source link

[Go] C Data Interface implementation violates cgo rules by default #70

Open lidavidm opened 1 year ago

lidavidm commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

cgo requires that C memory cannot contain persistent pointers to Go memory:

Go code may not store a Go pointer in C memory. C code may store Go pointers in C memory, subject to the rule above: it must stop storing the Go pointer when the C function returns.

But we do exactly this in the C Data Interface:

https://github.com/apache/arrow/blob/4f1d255f3dc57457e5c78d98c4b76fc523cf9961/go/arrow/cdata/cdata_exports.go#L392

We want to do this because we do not want to copy data. However, technically, we can't. And if we enable the runtime checks for this via GODEBUG=cgocheck=2, we indeed get a crash:

``` write of Go pointer 0xc0004a8000 to non-Go memory 0x55cdba7ebd18 fatal error: Go pointer stored into non-Go memory runtime stack: runtime.throw({0x7f6c404fc97f?, 0xc0000044e0?}) /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/panic.go:992 +0x71 runtime.cgoCheckWriteBarrier.func1() /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/cgocheck.go:55 +0x8a runtime.systemstack() /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/asm_amd64.s:469 +0x46 goroutine 17 [running, locked to thread]: runtime.systemstack_switch() /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/asm_amd64.s:436 fp=0xc0001eb8c8 sp=0xc0001eb8c0 pc=0x7f6c40676d40 runtime.cgoCheckWriteBarrier(0x55cdba7ebd18, 0xc0004a8000) /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/cgocheck.go:53 +0xe8 fp=0xc0001eb8f8 sp=0xc0001eb8c8 pc=0x7f6c4061a7c8 runtime.wbBufFlush(0xc0001eb950?, 0x7f6c40619433?) /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/mwbbuf.go:190 +0x2e fp=0xc0001eb918 sp=0xc0001eb8f8 pc=0x7f6c40644aee runtime.wbBufFlush(0x55cdba7ebd18, 0xc0004a8000) :1 +0x2c fp=0xc0001eb938 sp=0xc0001eb918 pc=0x7f6c4067b46c runtime.gcWriteBarrier() /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/asm_amd64.s:1669 +0xa3 fp=0xc0001eb9b8 sp=0xc0001eb938 pc=0x7f6c406790c3 runtime.gcWriteBarrierR8() /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/asm_amd64.s:1729 +0x7 fp=0xc0001eb9c0 sp=0xc0001eb9b8 pc=0x7f6c406791a7 github.com/apache/arrow/go/v11/arrow/cdata.exportArray({0x7f6c40e79bd8?, 0xc0001d44c0?}, 0x55cdb9f4dd20, 0xc0001d45c0?) /home/lidavidm/go/pkg/mod/github.com/apache/arrow/go/v11@v11.0.0-20230127143402-4f1d255f3dc5/arrow/cdata/cdata_exports.go:392 +0xb05 fp=0xc0001ebb30 sp=0xc0001eb9c0 pc=0x7f6c40d04365 github.com/apache/arrow/go/v11/arrow/cdata.exportArray({0x7f6c40e7a188?, 0xc0001d4480?}, 0x7ffdebc06300, 0xc0000d6aa0?) /home/lidavidm/go/pkg/mod/github.com/apache/arrow/go/v11@v11.0.0-20230127143402-4f1d255f3dc5/arrow/cdata/cdata_exports.go:421 +0x9d2 fp=0xc0001ebca0 sp=0xc0001ebb30 pc=0x7f6c40d04232 github.com/apache/arrow/go/v11/arrow/cdata.ExportArrowRecordBatch({0x7f6c40e79390, 0xc0003210e0}, 0xc00019c5b0?, 0x0) /home/lidavidm/go/pkg/mod/github.com/apache/arrow/go/v11@v11.0.0-20230127143402-4f1d255f3dc5/arrow/cdata/interface.go:217 +0x2a5 fp=0xc0001ebd88 sp=0xc0001ebca0 pc=0x7f6c40cf9505 github.com/apache/arrow/go/v11/arrow/cdata.cRecordReader.next({{0x7f6c40e76328?, 0xc0004007e0?}, 0x0?}, 0x7ffdebc06688?) /home/lidavidm/go/pkg/mod/github.com/apache/arrow/go/v11@v11.0.0-20230127143402-4f1d255f3dc5/arrow/cdata/cdata_exports.go:459 +0x65 fp=0xc0001ebdd0 sp=0xc0001ebd88 pc=0x7f6c40d04565 github.com/apache/arrow/go/v11/arrow/cdata.streamGetNext(0x0?, 0xc000004601?) /home/lidavidm/go/pkg/mod/github.com/apache/arrow/go/v11@v11.0.0-20230127143402-4f1d255f3dc5/arrow/cdata/exports.go:136 +0x6d fp=0xc0001ebe10 sp=0xc0001ebdd0 pc=0x7f6c40d0582d _cgoexp_c84825904f89_streamGetNext(0x7ffdebc062c0) _cgo_gotypes.go:379 +0x28 fp=0xc0001ebe30 sp=0xc0001ebe10 pc=0x7f6c40d06a08 runtime.cgocallbackg1(0x7f6c40d069e0, 0xc0001ebfe0?, 0x0) /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/cgocall.go:314 +0x2c2 fp=0xc0001ebf00 sp=0xc0001ebe30 pc=0x7f6c40619902 runtime.cgocallbackg(0x0?, 0x0?, 0x0?) /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/cgocall.go:233 +0x109 fp=0xc0001ebf90 sp=0xc0001ebf00 pc=0x7f6c40619589 runtime.cgocallbackg(0x7f6c40d069e0, 0x7ffdebc062c0, 0x0) :1 +0x31 fp=0xc0001ebfb8 sp=0xc0001ebf90 pc=0x7f6c4067b351 runtime.cgocallback(0x0, 0x0, 0x0) /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/asm_amd64.s:971 +0xb3 fp=0xc0001ebfe0 sp=0xc0001ebfb8 pc=0x7f6c40678d93 runtime.goexit() /nix/store/5xayx09nj7rxwkf1c2pxzdslp4ilrv5v-go-1.18.5/share/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0001ebfe8 sp=0xc0001ebfe0 pc=0x7f6c40678fe1 ```

Now, in practice, this is OK, because separately, we keep the Go memory alive by storing the ArrayData in the C Data Interface private_data via a cgo.Handle. (Well, assuming Go's GC is non-moving.) But technically, this is wrong.

What are some solutions?

Both of these require you to remember to do the right thing; if you forget to use the right allocator, you'll still be in violation of the rules. So on top of that, we could add a field to Buffer that indicates whether the buffer is safe to export. Then the cgo-based allocators can set this flag, and during exporting, we can error (or possibly copy) if the buffer is not safe for export.

Component(s)

Go

lidavidm commented 1 year ago

@zeroshade for your consideration

zeroshade commented 1 year ago

Another possible solution might be to leverage jemalloc or https://pkg.go.dev/github.com/dgraph-io/ristretto/z#Allocator as a way to amortize some of the FFI cost by allocating larger chunks at a time, or something to that effect.

zeroshade commented 1 year ago

That said, because the allocator is defined as an interface, anyone could easily create their own allocator if they prefer and we can have the simple Malloc-based one you created for now and if someone wants a better allocator can they create/contribute one.

That said, I agree with adding a flag for buffers to denote them as safe to export but that would require modifying the Allocator interface so we can know whether or not to set that flag.

lidavidm commented 1 year ago

Another possible solution might be to leverage jemalloc or https://pkg.go.dev/github.com/dgraph-io/ristretto/z#Allocator as a way to amortize some of the FFI cost by allocating larger chunks at a time, or something to that effect.

Just implement malloc in go :) Allocate a big chunk via C.malloc, then slice it up and hand out buffers as needed (allocating more chunks as needed)

lidavidm commented 1 year ago

At least in Java, Netty's allocator (which we use in Arrow) pools allocations based on size classes to avoid having to allocate new buffers

zeroshade commented 1 year ago

Lol, I mean, yes I'd prefer to avoid implementing malloc in go, hence my suggestion of us just leveraging jemalloc or something directly haha. Or some other way to minimize the FFI. That said, I'm fine with just a simple malloc-based allocator for now and if anyone needs better performance we can re-assess or they can use the allocator interface and build their own and then (hopefully) contribute it.

lidavidm commented 1 year ago

Fair enough!