janelia-flyem / dvid

Distributed, Versioned, Image-oriented Dataservice
http://dvid.io

Segmentation faults when building from source #249

Closed rivo closed 6 years ago

rivo commented 6 years ago

I'm trying to build DVID from source (latest commit), following the instructions found in the user guide. Unfortunately, I'm getting segmentation violation panics when I'm trying to load files into the (LevelDB) backend (using the "load" command).

I've included a stack trace below. They look slightly different sometimes but it generally happens when the lz4 package makes a CGO call. I'm on "Ubuntu 16.04.4 LTS".

Any ideas on how to make this work? Maybe there's an extra step that I should follow that's not in the guide?

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0xbacc43]

runtime stack:
runtime.throw(0xdd959a, 0x2a)
        /usr/local/go/src/runtime/panic.go:619 +0x81
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:372 +0x28e

goroutine 6002 [syscall]:
runtime.cgocall(0xbbac00, 0xc4b38a7cb0, 0xc42008a400)
        /usr/local/go/src/runtime/cgocall.go:128 +0x64 fp=0xc4b38a7c70 sp=0xc4b38a7c38 pc=0x4065c4
github.com/janelia-flyem/go/golz4._Cfunc_LZ4_compress_limitedOutput(0xc443c56000, 0xc553196004, 0x809000008000, 0x0)
        _cgo_gotypes.go:62 +0x4d fp=0xc4b38a7cb0 sp=0xc4b38a7c70 pc=0x77a3cd
github.com/janelia-flyem/go/golz4.Compress(0xc443c56000, 0x8000, 0x8000, 0xc553196004, 0x8090, 0x8090, 0xc420340028, 0xc4b38a7da8, 0x988057)
        /root/go/src/github.com/janelia-flyem/go/golz4/lz4.go:50 +0x57 fp=0xc4b38a7d00 sp=0xc4b38a7cb0 pc=0x77a707
github.com/janelia-flyem/dvid/dvid.SerializeData(0xc443c56000, 0x8000, 0x8000, 0xff04, 0xdbdc84, 0xc, 0x1, 0xcf9580, 0xc552faa7e0)
        /root/go/src/github.com/janelia-flyem/dvid/dvid/serialize.go:206 +0x540 fp=0xc4b38a7db8 sp=0xc4b38a7d00 pc=0x79cdf0
github.com/janelia-flyem/dvid/datatype/imageblk.(*Data).writeBlocks.func1(0xc4202f87e0, 0xc4202fb400, 0xc4202fc680, 0xc511eeebe0, 0xc511eeebe8, 0x7ff4c2f7b138, 0xc420304580, 0xc42000e100, 0xc420592000, 0x900a, ...)
        /root/go/src/github.com/janelia-flyem/dvid/datatype/imageblk/write.go:600 +0x1f2 fp=0xc4b38a7f60 sp=0xc4b38a7db8 pc=0xa4edb2
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc4b38a7f68 sp=0xc4b38a7f60 pc=0x45d0f1
created by github.com/janelia-flyem/dvid/datatype/imageblk.(*Data).writeBlocks
        /root/go/src/github.com/janelia-flyem/dvid/datatype/imageblk/write.go:589 +0x29a
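
A note on reading the trace above: the third argument word of the cgo call, 0x809000008000, looks like two 32-bit C ints packed into one 64-bit word. That is an assumption about how the Go runtime prints cgo arguments on amd64, not something confirmed in this thread. Under that assumption, the low half is the input size and the high half is the maximum output size, and the latter matches the worst-case bound given by the LZ4_COMPRESSBOUND macro in lz4.h. A minimal sketch, with hypothetical helper names (compressBound, splitPackedInts), that checks the arithmetic:

```go
package main

import "fmt"

// compressBound mirrors the LZ4_COMPRESSBOUND macro from lz4.h:
// the worst-case compressed size for n input bytes.
// (Hypothetical helper name, not part of DVID or golz4.)
func compressBound(n int) int {
	return n + n/255 + 16
}

// splitPackedInts splits one 64-bit traceback word into two 32-bit values.
// Assumption: the two C int arguments share one word, low half first.
func splitPackedInts(word uint64) (lo, hi int32) {
	return int32(uint32(word)), int32(uint32(word >> 32))
}

func main() {
	// Argument word copied from the _Cfunc_LZ4_compress_limitedOutput frame.
	inputSize, maxOutputSize := splitPackedInts(0x809000008000)
	fmt.Printf("inputSize=%#x maxOutputSize=%#x\n", inputSize, maxOutputSize)
	fmt.Printf("compressBound(%#x)=%#x\n", inputSize, compressBound(int(inputSize)))
}
```

If this reading is right, the sizes passed to the compressor (0x8000 in, 0x8090 out) are consistent with each other, which would fit the later finding that the fault lies in the bundled lz4 code rather than in the buffer sizes DVID passes.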

DocSavage commented 6 years ago

This is an outstanding issue (https://github.com/janelia-flyem/dvid/issues/163): the lz4 library currently compiled into dvid is not compatible with newer Ubuntu versions for some reason. The solution is simply to update the lz4 library, but we have not done that for our production versions because (1) we do not run those OS versions, and (2) we are not sure how the new lz4 code would affect previously stored lz4 data. I'll prioritize the upgrade since this is a sticking point for multiple people.

DocSavage commented 6 years ago

To clarify, the lz4 library we depend on is https://github.com/cloudflare/golz4, which has not been updated in 3 years. I'll investigate moving to a different, maintained library.

rivo commented 6 years ago

Thanks. Spent all day chasing this down and came to the same conclusion. (I did not know about #163.) I actually updated the lz4 library but the seg fault still happened. There is an issue cloudflare/golz4#8 and the related pull request cloudflare/golz4#9 where they seem to have the same issue. I patched the code with the branch from the pull request and now it seems to work. Not sure why they haven't merged it into the master branch. (Maybe nobody at Cloudflare feels responsible for this.)

DocSavage commented 6 years ago

As a workaround, I believe you can specify a different compression when creating the data instance you are loading images into. Since the fix requires rewriting code wherever lz4 is used, I'm not sure when the change will land. For de novo use of dvid, I might move to a different fast codec if lz4 isn't as well supported.

DocSavage commented 6 years ago

Note that this was fixed in commit 7380190 just now.

rivo commented 6 years ago

I'm getting this now:

dvid/serialize.go:21:2: cannot find package "github.com/pierrec/lz4" in any of:
        /usr/local/go/src/github.com/pierrec/lz4 (from $GOROOT)
        /root/go/src/github.com/pierrec/lz4 (from $GOPATH)

I guess it's missing from the get-go-dependencies.sh script?

DocSavage commented 6 years ago

This was added to the CMake file. After updating to HEAD on master, you have to re-run the "cmake" command before doing "make dvid". That should pull the necessary library automatically.

DocSavage commented 6 years ago

Sorry, my previous comment is incorrect because @stuarteberg moved the build system to conda. You are correct: github.com/pierrec/lz4 needs to be in the get-go-dependencies.sh script. We are going to modify that file shortly.

DocSavage commented 6 years ago

The master branch has been reverted to the older CGo lz4 library due to issues. See issue #251.

DocSavage commented 6 years ago

This issue has been solved by updating the Go lz4 library with new lz4 source code. See closed issue #251.