coreos / torus

Torus Distributed Storage
https://coreos.com/blog/torus-distributed-storage-by-coreos.html
Apache License 2.0

WriteBlock() accessed invalid memory address to copy data #343

Open nak3 opened 8 years ago

nak3 commented 8 years ago

Description

unexpected fault address 0x7f4fb9854000
fatal error: fault
[signal 0x7 code=0x2 addr=0x7f4fb9854000 pc=0x463d11]

goroutine 63169 [running]:
runtime.throw(0xd168e0, 0x5)
    /usr/lib/golang/src/runtime/panic.go:547 +0x90 fp=0xc82033d3b0 sp=0xc82033d398
runtime.sigpanic()
    /usr/lib/golang/src/runtime/sigpanic_unix.go:21 +0x10c fp=0xc82033d400 sp=0xc82033d3b0
runtime.memmove(0x7f4fb9800000, 0xc8206e6000, 0x80000)
    /usr/lib/golang/src/runtime/memmove_amd64.s:83 +0x91 fp=0xc82033d408 sp=0xc82033d400
github.com/coreos/torus/storage.(*MFile).WriteBlock(0xc82027cc30, 0x3530, 0xc8206e6000, 0x80000, 0x80000, 0x0, 0x0)
    /root/work/src/github.com/coreos/torus/storage/mmap_file.go:87 +0x1d4 fp=0xc82033d510 sp=0xc82033d408
github.com/coreos/torus/storage.(*mfileBlock).WriteBlock(0xc8202686c0, 0x7fcb19643648, 0xc823fb2ab0, 0x10000000003, 0xc18, 0x2, 0xc8206e6000, 0x80000, 0x80000, 0x0, ...)
    /root/work/src/github.com/coreos/torus/storage/mfile.go:226 +0x4f4 fp=0xc82033d7a0 sp=0xc82033d510
github.com/coreos/torus/distributor.(*Distributor).PutBlock(0xc8202687e0, 0x7fcb19643648, 0xc823fb2ab0, 0x10000000003, 0xc18, 0x2, 0xc8206e6000, 0x80000, 0x80000, 0x0, ...)
    /root/work/src/github.com/coreos/torus/distributor/rpc.go:42 +0x35d fp=0xc82033d908 sp=0xc82033d7a0
github.com/coreos/torus/distributor/protocols/grpc.(*handler).PutBlock(0xc820314040, 0x7fcb19643648, 0xc823fb2ab0, 0xc82077e4b0, 0xc821e2e000, 0x0, 0x0)
    /root/work/src/github.com/coreos/torus/distributor/protocols/grpc/grpc.go:123 +0x188 fp=0xc82033da00 sp=0xc82033d908
github.com/coreos/torus/models._TorusStorage_PutBlock_Handler(0xc5ff20, 0xc820314040, 0x7fcb19643648, 0xc823fb2ab0, 0xc820775f40, 0x0, 0x0, 0x0, 0x0, 0x0)
    /root/work/src/github.com/coreos/torus/models/rpc.pb.go:652 +0x168 fp=0xc82033daa8 sp=0xc82033da00
github.com/coreos/torus/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc820268870, 0x7fcb19650480, 0xc820076000, 0xc8224a6000, 0xc820314100, 0x117a1b8, 0xc823fb2a50, 0x0, 0x0)
    /root/work/src/github.com/coreos/torus/vendor/google.golang.org/grpc/server.go:524 +0xe24 fp=0xc82033ddf0 sp=0xc82033daa8
github.com/coreos/torus/vendor/google.golang.org/grpc.(*Server).handleStream(0xc820268870, 0x7fcb19650480, 0xc820076000, 0xc8224a6000, 0xc823fb2a50)
    /root/work/src/github.com/coreos/torus/vendor/google.golang.org/grpc/server.go:684 +0x109d fp=0xc8203 sp=0xc82033ddf0
github.com/coreos/torus/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc820248b50, 0xc820268870, 0x7fcb19650480, 0xc820076000, 0xc8224a6000)
    /root/work/src/github.com/coreos/torus/vendor/google.golang.org/grpc/server.go:350 +0xa0 fp=0xc82033df78 sp=0xc82033df48
runtime.goexit()
    /usr/lib/golang/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82033df80 sp=0xc82033df78
created by github.com/coreos/torus/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
    /root/work/src/github.com/coreos/torus/vendor/google.golang.org/grpc/server.go:351 +0x9a

nak3 commented 8 years ago

I realized that this issue apparently occurs when the data written to a storage node exceeds the node's data size. I think the torus storage node needs a guard that stops writes once that size would be exceeded.

barakmich commented 7 years ago

That's a heck of a panic; it's within memmove. You're probably right on how to reproduce it though. Could it be related to some of your other fixes regarding size of the mmap'd file?