ochinchina / supervisord

a go-lang supervisor implementation
MIT License
4.03k stars 569 forks source link

MacOS build version panic if kill program #272

Open Colstuwjx opened 3 years ago

Colstuwjx commented 3 years ago

Hi, I'm trying to kill a program which is managed by supervisord, and it panic at waitForExit:

INFO[2021-06-08T18:02:33+08:00] program stopped with status:signal: killed    program=supervisor_listener_test
fatal error: sync: Unlock of unlocked RWMutex
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x14a9047]

goroutine 39 [running]:
runtime.throw(0x15e97e7, 0x20)
    /usr/local/go/src/runtime/panic.go:1117 +0x72 fp=0xc0001a8e80 sp=0xc0001a8e50 pc=0x1038372
sync.throw(0x15e97e7, 0x20)
    /usr/local/go/src/runtime/panic.go:1103 +0x35 fp=0xc0001a8ea0 sp=0xc0001a8e80 pc=0x1069935
sync.(*RWMutex).Unlock(0xc0002e0068)
    /usr/local/go/src/sync/rwmutex.go:142 +0xc6 fp=0xc0001a8ed8 sp=0xc0001a8ea0 pc=0x1079ce6
runtime.call16(0x0, 0x16142f8, 0xc000310228, 0x800000008)
    /usr/local/go/src/runtime/asm_amd64.s:550 +0x3e fp=0xc0001a8ef8 sp=0xc0001a8ed8 pc=0x106cebe
runtime.reflectcallSave(0xc0001a9030, 0x16142f8, 0xc000310228, 0xc000000008)
    /usr/local/go/src/runtime/panic.go:877 +0x58 fp=0xc0001a8f28 sp=0xc0001a8ef8 pc=0x1037938
runtime.runOpenDeferFrame(0xc0002a6780, 0xc0003101e0, 0x0)
    /usr/local/go/src/runtime/panic.go:851 +0x62d fp=0xc0001a8fb0 sp=0xc0001a8f28 pc=0x103760d
panic(0x15467c0, 0x19887b0)
    /usr/local/go/src/runtime/panic.go:965 +0x1b9 fp=0xc0001a9078 sp=0xc0001a8fb0 pc=0x1037bb9
runtime.panicmem()
    /usr/local/go/src/runtime/panic.go:212 +0x5b fp=0xc0001a9098 sp=0xc0001a9078 pc=0x1035fbb
runtime.sigpanic()
    /usr/local/go/src/runtime/signal_unix.go:734 +0x179 fp=0xc0001a90d0 sp=0xc0001a9098 pc=0x104ef19
github.com/ochinchina/supervisord/process.(*Process).waitForExit(0xc0002e0000, 0x1)
    /Users/jacky/Colstuwjx/Projects/OpenSource/github.com/ochinchina/supervisord/process/process.go:495 +0x267 fp=0xc0001a93e0 sp=0xc0001a90d0 pc=0x14a9047
github.com/ochinchina/supervisord/process.(*Process).run(0xc0002e0000, 0xc00028e438)
    /Users/jacky/Colstuwjx/Projects/OpenSource/github.com/ochinchina/supervisord/process/process.go:599 +0x8f1 fp=0xc0001a9ce0 sp=0xc0001a93e0 pc=0x14aa011
github.com/ochinchina/supervisord/process.(*Process).Start.func1(0xc0002e0000, 0x0, 0x0)
    /Users/jacky/Colstuwjx/Projects/OpenSource/github.com/ochinchina/supervisord/process/process.go:162 +0x8e fp=0xc0001a9fc8 sp=0xc0001a9ce0 pc=0x14b148e
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc0001a9fd0 sp=0xc0001a9fc8 pc=0x106e901
created by github.com/ochinchina/supervisord/process.(*Process).Start
    /Users/jacky/Colstuwjx/Projects/OpenSource/github.com/ochinchina/supervisord/process/process.go:159 +0x38d

go version: go version go1.16 darwin/amd64 code version: cb079d1

Colstuwjx commented 3 years ago

I've been found the fail point, but I'm still wondering about the reason:

  1. process would exit if it's been killed, and it would reach out process.goL493
func (p *Process) waitForExit(startSecs int64) {
        ...
       defer p.lock.Unlock()
        ...
}
  1. at the same time, process run() suddenly called defer unlock at process.goL526, and it caused panic:
func (p *Process) run(finishCb func()) {
        ...
        defer p.lock.Unlock()
        ...
}

How to reproduce: it's easily reproduced by running a supervisord, and kill a managed program, it would panic at that moment...

Colstuwjx commented 3 years ago

I think the root cause was that we didn't set the program logger if it's eventlistener type, and https://github.com/ochinchina/supervisord/pull/273/commits/827585c8442aa5d32b5d72a7d488f7f8cc3f9eb0 is my fix.