golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.85k stars 17.65k forks source link

runtime: checkdead fires due to suspected race in the Go runtime when GOMAXPROCS=1 on AWS #59600

Closed lcoffe-botify closed 1 year ago

lcoffe-botify commented 1 year ago

What version of Go are you using (go version)?

1.20.3

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

This happens on m3.medium AWS instances. I could not reproduce it anywhere else so far. This is a monothreaded amd64 instance

What did you do?

What did you expect to see?

No crash

What did you see instead?

A crash with "runtime: checkdead: found g XX in status 1". This crash happens on average after a few hours, apparently correlated with the network activity - the more network activity (ie, the more goroutines waiting on network), the more often it will happen.

runtime: checkdead: find g 34876 in status 1
fatal error: checkdead: runnable g

runtime stack:
runtime.throw({0x105b8e6?, 0×300?})
#011/us/local/go/src/runtime/panic.go:1047 +0×5d fp=0xc000a55d58 sp=0xc000a55d28 pc=0x43711d
runtime.checkdead.func1(0xc0009fba00)
#011/usr/local/go/src/runtime/proc.go:5216 +0xe5 fp=0xc000a55d90 sp=0xc000a55d58 pc=0x444dc5
runtime.forEachG(0xc000a55dfo)
#011/usr/local/go/sr/runtime/proc.go:591 +0×5c fp=0xc000a55dc® sp=0xc000a55d90 pc=0×43a8bc
runtime.checkdead()
#011/usr/local/go/src/runtime/proc.go:5203 +0xf7 fp=0xc000a5510 sp=0xc000a55dc0 pc=0x444ab7
runtime.mput(0xc000a55e40?)
#011/usr/local/go/src/runtime/proc.go:5696 +0×33 fp=0x000a5520 sp=0xc000a55e10 pc=0x446033
runtime.stopm()
#011/usr/local/go/src/runtime/proc.go:2332 +0×67 fp=0xc000a5550 sp=0xc000a55e20 pc=0×43dce7 
runtime.findRunnable()
#011/usr/local/go/src/runtime/proc.go:3007 +0xabc fp=0xc000a55f58 sp=0x000a55e50 pc=0x43f55c
runtime.schedule()
#011/usr/local/go/src/runtime/proc.go: 3360 +0xb1 fp=0xc000a55f90 sp=0xc000a55f58 pc=0×440391
runtime.park_m(0x000611380?)
#011/usr/local/go/src/runtime/proc.go:3511 +0×12d fp=0xc000a55fco sp=oxc000a55f90 pc=0×4408ad
runtime.mcall()
#011/usr/local/go/src/runtime/asm_amd64.s:452+0×43 fp=0xc000a55fdo sp=0xc000a55fco pc=0×469f23

Additional details

We've encountered a race condition in the Go runtime that leads to a panic in the checkdead() function when running on a monothreaded instance (GOMAXPROCS=1). The issue seems related to a specific scenario in the findRunnable() function where a goroutine is readied between the findRunnable() check and stopm() call on line 3007 acquiring the scheduler lock, causing an inconsistent scheduler state.

Recap:

The race condition appears to be related to the "delicate dance" code in findRunnable(), which tries to handle a race between the network poller readying a goroutine and the M being stopped. However, this race condition can still occur in rare cases, leading to a panic in checkdead().

To investigate this issue further, we tried to create a minimal reproducer, but we could not trigger it consistently due to the race condition's rarity. Nonetheless, runtime stack traces from the checkdead() panics in the real-world scenario clearly show that it originates from the specific stopm() call in findRunnable(), indicating a strong correlation between the race condition and the panic.

A temporary fix patch has been applied to the checkdead() function to detect and handle the inconsistent state caused by the race condition without panicking immediately. Instead, it collects information and prints debug logs about the state of goroutines and the runtime stack. It only panics when the inconsistent state occurs too many times in a row or for an extended period of time. All occurrences so far were already resolved (ie an M was active) on the next call to checkdead().

This is ofc not a viable solution as it does not fix the root cause of the issue and may introduce other problems.

--- 1.20.3-orig/src/runtime/proc.go
+++ 1.20.3-botigo/src/runtime/proc.go
@@ -5154,6 +5154,9 @@
    }
    unlock(&sched.lock)
 }
+
+var inconsistentStateCount uint8
+var inconsistentStateTime int64

 // Check for deadlock situation.
 // The check is based on number of running M's, if 0 -> deadlock.
@@ -5199,6 +5202,7 @@
        throw("checkdead: inconsistent counts")
    }

+   inconsistentState := false
    grunning := 0
    forEachG(func(gp *g) {
        if isSystemGoroutine(gp, false) {
@@ -5212,10 +5216,76 @@
        case _Grunnable,
            _Grunning,
            _Gsyscall:
-           print("runtime: checkdead: find g ", gp.goid, " in status ", s, "\n")
-           throw("checkdead: runnable g")
+
+           // no M is available - but at least one G is runnable/running or in a syscall; this is an inconsistent state
+           inconsistentState = true
+
+           // Although, if the G is runnable, this might just be a race condition as it was readied by an event right
+           // before the sched lock was acquired.
+           // If one of the G is running or in a syscall, then it's a real issue
+           if s&^_Gscan != _Grunnable {
+               print("runtime: checkdead: find g ", gp.goid, " in status ", s, "\n")
+               throw("checkdead: runnable g")
+           }
+
+           print("[checkdead] find g ", gp.goid, " in status ", s,
+               " - mcount=", mcount(), " sched.nmidle=", sched.nmidle, " sched.nmidlelocked=", sched.nmidlelocked,
+               " sched.nmsys=", sched.nmsys, " sched.nmspinning=", sched.nmspinning.Load(), "\n")
        }
    })
+
+   // if an inconsistent state due to a race condition was detected, then print some debug info
+   // panic if it we get too many occurences in a row, or if we've been stuck too much time without reaching this limit
+   if inconsistentState {
+       // increase the count of inconsistent state occurence
+       inconsistentStateCount++
+
+       // if we get inconsistent state too many times in a row, then panic
+       if inconsistentStateCount == 10 {
+           throw("runtime: checkdead: too many successive occurences of inconsistent state")
+       }
+
+       // if this is the first occurence of inconsistent state, save time
+       // else check for time stuck
+       if inconsistentStateCount == 1 {
+           inconsistentStateTime = nanotime()
+       } else if inconsistentStateTime < nanotime()-5e9 { // 5s
+           throw("runtime: checkdead: too much time stuck in successive occurences of inconsistent state")
+       }
+
+       // print the runtime stacktrace
+       {
+           print("[checkdead] runtime stack\n")
+           pc := getcallerpc()
+           sp := getcallersp()
+           gp := getg()
+           traceback(pc, sp, 0, gp)
+       }
+
+       // print G's state, omitting dead ones, with stacktraces for runnable ones
+       print("[checkdead] goroutines\n")
+       forEachG(func(gp *g) {
+           if isSystemGoroutine(gp, false) {
+               return
+           }
+
+           s := gp.atomicstatus.Load()
+
+           if s&^_Gscan == _Gdead {
+               return
+           }
+
+           goroutineheader(gp)
+           if s&^_Gscan == _Grunnable {
+               traceback(^uintptr(0), ^uintptr(0), 0, gp)
+           }
+       })
+   } else if inconsistentStateCount > 0 {
+       // things got back to normal, clear counters
+       inconsistentStateCount = 0
+       inconsistentStateTime = 0
+   }
+
    if grunning == 0 { // possible if main goroutine calls runtime·Goexit()
        unlock(&sched.lock) // unlock so that GODEBUG=scheddetail=1 doesn't hang
        fatal("no goroutines (main called runtime.Goexit) - deadlock!")
lcoffe-botify commented 1 year ago

Not sure it is valuable information, but, as can be seen in the runtime stacktrace, the source of the schedule() call is mcall(park_m), which is done by gopark(). It is always the case.

prattmic commented 1 year ago

Thanks for the report. Do you have crash output with that extra logging added to checkdead?

lcoffe-botify commented 1 year ago

hey @prattmic, thank you for looking into it. I have been considering if it is possible that the netpoll part in findRunnable is just pushing readied G's to the global run queue because it couldn't find an idle P (with injectglist), then it goes straight to stopm, and up the stack checkdead finds them with no active M.

Currently working on obtaining a core dump. Here is an example of output with the patched runtime

  1. in the forEachG loop

    [checkdead] find g 47 in status 1 - mcount=9 sched.nmidle=8 sched.nmidlelocked=0 sched.nmsys=1 sched.nmspinning=0
    [checkdead] find g 54 in status 1 - mcount=9 sched.nmidle=8 sched.nmidlelocked=0 sched.nmsys=1 sched.nmspinning=0
    [checkdead] find g 61 in status 1 - mcount=9 sched.nmidle=8 sched.nmidlelocked=0 sched.nmsys=1 sched.nmspinning=0
  2. runtime stack (line numbers are off, patched version)

    [checkdead] runtime stack
    runtime.mput(0xc000bc1e40?)
    #011/usr/local/go/src/runtime/proc.go:5766 +0x33
    runtime.stopm()
    #011/usr/local/go/src/runtime/proc.go:2332 +0x67
    runtime.findRunnable()
    #011/usr/local/go/src/runtime/proc.go:3007 +0xabc
    runtime.schedule()
    #011/usr/local/go/src/runtime/proc.go:3360 +0xb1
    runtime.park_m(0xc000007a00?)
    #011/usr/local/go/src/runtime/proc.go:3511 +0x12d
    runtime.mcall()
    #011/usr/local/go/src/runtime/asm_amd64.s:452 +0x43
  3. listing of the goroutines - some goroutine bloat was added in the hope to trigger the issue faster - interesting to see that the three runnable G's here were in netpoll

    [checkdead] goroutines
    goroutine 1 [IO wait]:
    goroutine 6 [select]:
    goroutine 15 [select]:
    goroutine 47 [runnable]:
    internal/poll.runtime_pollWait(0x7f794a149868, 0x72)
    #011/usr/local/go/src/runtime/netpoll.go:306 +0x89
    internal/poll.(*pollDesc).wait(0xc00038b800?, 0xc0008ca000?, 0x0)
    #011/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
    internal/poll.(*pollDesc).waitRead(...)
    #011/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
    internal/poll.(*FD).Read(0xc00038b800, {0xc0008ca000, 0x8000, 0x8000})
    #011/usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
    net.(*netFD).Read(0xc00038b800, {0xc0008ca000?, 0x1060100000000?, 0x8?})
    #011/usr/local/go/src/net/fd_posix.go:55 +0x29
    net.(*conn).Read(0xc0000124e0, {0xc0008ca000?, 0x4063c0?, 0xc00130d6e0?})
    #011/usr/local/go/src/net/net.go:183 +0x45
    bufio.(*Reader).Read(0xc0003fe240, {0xc000869b60, 0x9, 0x7f79714cf108?})
    #011/usr/local/go/src/bufio/bufio.go:237 +0x1bb
    io.ReadAtLeast({0x1373580, 0xc0003fe240}, {0xc000869b60, 0x9, 0x9}, 0x9)
    #011/usr/local/go/src/io/io.go:332 +0x9a
    io.ReadFull(...)
    #011/usr/local/go/src/io/io.go:351
    golang.org/x/net/http2.readFrameHeader({0xc000869b60?, 0x9?, 0xbcd94fcc29a?}, {0x1373580?, 0xc0003fe240?})
    #011/Users/luciencoffe/go/pkg/mod/golang.org/x/net@v0.8.0/http2/frame.go:237 +0x6e
    golang.org/x/net/http2.(*Framer).ReadFrame(0xc000869b20)
    #011/Users/luciencoffe/go/pkg/mod/golang.org/x/net@v0.8.0/http2/frame.go:498 +0x95
    google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000a0b40, 0x0?)
    #011/Users/luciencoffe/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:1592 +0x257
    created by google.golang.org/grpc/internal/transport.newHTTP2Client
    #011/Users/luciencoffe/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:394 +0x1e2a
    goroutine 4473675 [select]:
    goroutine 4473942 [select]:
    goroutine 50 [select, 216 minutes]:
    goroutine 49 [select]:
    goroutine 43 [select, 216 minutes]:
    goroutine 48 [select]:
    goroutine 56 [select]:
    goroutine 57 [select, 216 minutes]:
    goroutine 54 [runnable]:
    internal/poll.runtime_pollWait(0x7f794a149688, 0x72)
    #011/usr/local/go/src/runtime/netpoll.go:306 +0x89
    internal/poll.(*pollDesc).wait(0xc00038b980?, 0xc0001a0000?, 0x0)
    #011/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
    internal/poll.(*pollDesc).waitRead(...)
    #011/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
    internal/poll.(*FD).Read(0xc00038b980, {0xc0001a0000, 0x8000, 0x8000})
    #011/usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
    net.(*netFD).Read(0xc00038b980, {0xc0001a0000?, 0xc00130d8c0?, 0xc0008f1cd8?})
    #011/usr/local/go/src/net/fd_posix.go:55 +0x29
    net.(*conn).Read(0xc000012570, {0xc0001a0000?, 0x4a919a?, 0xc0003fe780?})
    #011/usr/local/go/src/net/net.go:183 +0x45
    bufio.(*Reader).Read(0xc0003fe780, {0xc0005d02ec, 0x1d14, 0x9?})
    #011/usr/local/go/src/bufio/bufio.go:237 +0x1bb
    io.ReadAtLeast({0x1373580, 0xc0003fe780}, {0xc0005ce000, 0x4000, 0x4000}, 0x4000)
    #011/usr/local/go/src/io/io.go:332 +0x9a
    io.ReadFull(...)
    #011/usr/local/go/src/io/io.go:351
    golang.org/x/net/http2.(*Framer).ReadFrame(0xc000869c00)
    #011/Users/luciencoffe/go/pkg/mod/golang.org/x/net@v0.8.0/http2/frame.go:506 +0x125
    google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000a0d80, 0x0?)
    #011/Users/luciencoffe/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:1592 +0x257
    created by google.golang.org/grpc/internal/transport.newHTTP2Client
    #011/Users/luciencoffe/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:394 +0x1e2a
    goroutine 55 [select]:
    goroutine 63 [select]:
    goroutine 61 [runnable]:
    internal/poll.runtime_pollWait(0x7f794a149598, 0x72)
    #011/usr/local/go/src/runtime/netpoll.go:306 +0x89
    internal/poll.(*pollDesc).wait(0xc00038bb00?, 0xc0001c0000?, 0x0)
    #011/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
    internal/poll.(*pollDesc).waitRead(...)
    #011/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
    internal/poll.(*FD).Read(0xc00038bb00, {0xc0001c0000, 0x8000, 0x8000})
    #011/usr/local/go/src/internal/poll/fd_unix.go:167 +0x299
    net.(*netFD).Read(0xc00038bb00, {0xc0001c0000?, 0x1060100000000?, 0x8?})
    #011/usr/local/go/src/net/fd_posix.go:55 +0x29
    net.(*conn).Read(0xc0000125c0, {0xc0001c0000?, 0x4063c0?, 0xc00266c540?})
    #011/usr/local/go/src/net/net.go:183 +0x45
    bufio.(*Reader).Read(0xc0003fecc0, {0xc000869d20, 0x9, 0x7f79714cf108?})
    #011/usr/local/go/src/bufio/bufio.go:237 +0x1bb
    io.ReadAtLeast({0x1373580, 0xc0003fecc0}, {0xc000869d20, 0x9, 0x9}, 0x9)
    #011/usr/local/go/src/io/io.go:332 +0x9a
    io.ReadFull(...)
    #011/usr/local/go/src/io/io.go:351
    golang.org/x/net/http2.readFrameHeader({0xc000869d20?, 0x9?, 0xbcd95094c0c?}, {0x1373580?, 0xc0003fecc0?})
    #011/Users/luciencoffe/go/pkg/mod/golang.org/x/net@v0.8.0/http2/frame.go:237 +0x6e
    golang.org/x/net/http2.(*Framer).ReadFrame(0xc000869ce0)
    #011/Users/luciencoffe/go/pkg/mod/golang.org/x/net@v0.8.0/http2/frame.go:498 +0x95
    google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000a0fc0, 0x0?)
    #011/Users/luciencoffe/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:1592 +0x257
    created by google.golang.org/grpc/internal/transport.newHTTP2Client
    #011/Users/luciencoffe/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:394 +0x1e2a
    goroutine 62 [select]:
    goroutine 65 [IO wait, 216 minutes]:
    goroutine 66 [select]:
    goroutine 67 [select]:
    goroutine 4473857 [select]:
    goroutine 4473552 [chan receive]:
    goroutine 4473423 [IO wait]:
    goroutine 4473770 [chan receive]:
    goroutine 4473611 [select]:
    goroutine 4473640 [chan receive]:
    goroutine 4473612 [IO wait]:
    goroutine 4473830 [select]:
    goroutine 4473639 [chan receive]:
    goroutine 4473745 [chan receive]:
    goroutine 4473561 [chan receive]:
    goroutine 4473753 [select]:
    goroutine 4473529 [chan receive]:
    goroutine 4473694 [IO wait]:
    goroutine 4473697 [select]:
    goroutine 4473759 [IO wait]:
    goroutine 4473906 [select]:
    goroutine 4473719 [select]:
    goroutine 4473892 [select]:
    goroutine 4473560 [chan receive]:
    goroutine 4473856 [IO wait]:
    goroutine 4473784 [select]:
    goroutine 4473908 [select]:
    goroutine 4473819 [select]:
    goroutine 4473417 [IO wait]:
    goroutine 4473747 [select]:
    goroutine 4473778 [select]:
    goroutine 4459635 [chan receive]:
    goroutine 4473689 [IO wait]:
    goroutine 4473774 [chan receive]:
    goroutine 4473601 [select]:
    goroutine 4473806 [select]:
    goroutine 4473420 [IO wait]:
    goroutine 4473741 [chan receive]:
    goroutine 4473699 [IO wait]:
    goroutine 4473839 [select]:
    goroutine 4473912 [select]:
    goroutine 4473803 [IO wait]:
    goroutine 4473430 [IO wait]:
    goroutine 4473418 [IO wait]:
    goroutine 4473837 [select]:
    goroutine 4473426 [IO wait]:
    goroutine 4473577 [select]:
    goroutine 4473744 [chan receive]:
    goroutine 4473645 [chan receive]:
    goroutine 4473615 [select]:
    goroutine 4473865 [select]:
    goroutine 4473840 [IO wait]:
    goroutine 4451270 [select]:
    goroutine 4473727 [chan receive]:
    goroutine 4473595 [select]:
    goroutine 4473556 [chan receive]:
    goroutine 4473516 [chan receive]:
    goroutine 4473682 [select]:
    goroutine 4473848 [IO wait]:
    goroutine 4473740 [chan receive]:
    goroutine 4473427 [IO wait]:
    goroutine 4451268 [select]:
    goroutine 4473820 [IO wait]:
    goroutine 4473799 [IO wait]:
    goroutine 4473583 [select]:
    goroutine 4473695 [select]:
    goroutine 4473754 [IO wait]:
    goroutine 4473651 [select]:
    goroutine 4473794 [select]:
    goroutine 4473553 [chan receive]:
    goroutine 4473528 [chan receive]:
    goroutine 4473910 [select]:
    goroutine 4473900 [select]:
    goroutine 4473643 [chan receive]:
    goroutine 4473847 [select]:
    goroutine 4473584 [IO wait]:
    goroutine 4473581 [select]:
    goroutine 4473913 [IO wait]:
    goroutine 4473419 [IO wait]:
    goroutine 4473654 [IO wait]:
    goroutine 4473428 [IO wait]:
    goroutine 4473700 [select]:
    goroutine 4473821 [select]:
    goroutine 4473435 [IO wait]:
    goroutine 4473730 [chan receive]:
    goroutine 4473726 [chan receive]:
    goroutine 4473532 [chan receive]:
    goroutine 4473713 [select]:
    goroutine 4473619 [select]:
    goroutine 4473816 [IO wait]:
    goroutine 4473563 [chan receive]:
    goroutine 4459733 [select]:
    goroutine 4473702 [select]:
    goroutine 4473849 [select]:
    goroutine 4473513 [chan receive]:
    goroutine 4473928 [IO wait]:
    goroutine 4473888 [select]:
    goroutine 4473568 [IO wait]:
    goroutine 4473655 [select]:
    goroutine 4473672 [select]:
    goroutine 4473859 [select]:
    goroutine 4473668 [select]:
    goroutine 4473914 [select]:
    goroutine 4473680 [select]:
    goroutine 4473766 [chan receive]:
    goroutine 4473931 [select]:
    goroutine 4473863 [select]:
    goroutine 4473706 [IO wait]:
    goroutine 4473933 [IO wait]:
    goroutine 4473732 [chan receive]:
    goroutine 4473623 [select]:
    goroutine 4473567 [select]:
    goroutine 4473894 [select]:
    goroutine 4473531 [chan receive]:
    goroutine 4473608 [IO wait]:
    goroutine 4473605 [select]:
    goroutine 4473768 [chan receive]:
    goroutine 4473432 [IO wait]:
    goroutine 4473665 [IO wait]:
    goroutine 4473781 [IO wait]:
    goroutine 4473554 [chan receive]:
    goroutine 4473889 [IO wait]:
    goroutine 4473739 [chan receive]:
    goroutine 4473616 [IO wait]:
    goroutine 4473517 [chan receive]:
    goroutine 4473720 [IO wait]:
    goroutine 4473621 [select]:
    goroutine 4473755 [select]:
    goroutine 4473551 [chan receive]:
    goroutine 4473653 [select]:
    goroutine 4473750 [IO wait]:
    goroutine 4473716 [IO wait]:
    goroutine 4473434 [IO wait]:
    goroutine 4473562 [chan receive]:
    goroutine 4473646 [IO wait]:
    goroutine 4473674 [IO wait]:
    goroutine 4473776 [chan receive]:
    goroutine 4473571 [select]:
    goroutine 4473802 [select]:
    goroutine 4451151 [chan receive]:
    goroutine 4473836 [IO wait]:
    goroutine 4473934 [select]:
    goroutine 4473826 [select]:
    goroutine 4473421 [IO wait]:
    goroutine 4473844 [IO wait]:
    goroutine 4473692 [select]:
    goroutine 4473898 [select]:
    goroutine 4473617 [select]:
    goroutine 4473880 [select]:
    goroutine 4473555 [chan receive]:
    goroutine 4451267 [IO wait]:
    goroutine 4473670 [select]:
    goroutine 4473723 [select]:
    goroutine 4473607 [select]:
    goroutine 4473909 [IO wait]:
    goroutine 4473424 [IO wait]:
    goroutine 4473751 [select]:
    goroutine 4473780 [select]:
    goroutine 4473885 [IO wait]:
    goroutine 4459735 [select]:
    goroutine 4473588 [IO wait]:
    goroutine 4473729 [chan receive]:
    goroutine 4473890 [select]:
    goroutine 4473743 [chan receive]:
    goroutine 4473686 [select]:
    goroutine 4473728 [chan receive]:
    goroutine 4473592 [IO wait]:
    goroutine 4473772 [chan receive]:
    goroutine 4473832 [select]:
    goroutine 4473886 [select]:
    goroutine 4473804 [select]:
    goroutine 4473690 [select]:
    goroutine 4473841 [select]:
    goroutine 4473917 [chan receive]:
    goroutine 4473916 [select]:
    goroutine 4473569 [select]:
    goroutine 4473596 [IO wait]:
    goroutine 4473580 [IO wait]:
    goroutine 4473735 [chan receive]:
    goroutine 4473717 [select]:
    goroutine 4473795 [IO wait]:
    goroutine 4473773 [chan receive]:
    goroutine 4473657 [select]:
    goroutine 4473597 [select]:
    goroutine 4473515 [chan receive]:
    goroutine 4473565 [select]:
    goroutine 4473852 [IO wait]:
    goroutine 4473864 [IO wait]:
    goroutine 4473613 [select]:
    goroutine 4473845 [select]:
    goroutine 4473893 [IO wait]:
    goroutine 4473709 [select]:
    goroutine 4473828 [select]:
    goroutine 4473514 [chan receive]:
    goroutine 4473660 [select]:
    goroutine 4473600 [IO wait]:
    goroutine 4473742 [chan receive]:
    goroutine 4473604 [IO wait]:
    goroutine 4473896 [select]:
    goroutine 4473777 [IO wait]:
    goroutine 4473940 [select]:
    goroutine 4473929 [select]:
    goroutine 4473817 [select]:
    goroutine 4473851 [select]:
    goroutine 4473591 [select]:
    goroutine 4473782 [select]:
    goroutine 4473431 [IO wait]:
    goroutine 4473771 [chan receive]:
    goroutine 4473662 [select]:
    goroutine 4473731 [chan receive]:
    goroutine 4473650 [IO wait]:
    goroutine 4473593 [select]:
    goroutine 4473603 [select]:
    goroutine 4473585 [select]:
    goroutine 4473823 [select]:
    goroutine 4473757 [select]:
    goroutine 4473884 [select]:
    goroutine 4473829 [IO wait]:
    goroutine 4473877 [IO wait]:
    goroutine 4473416 [IO wait]:
    goroutine 4473775 [chan receive]:
    goroutine 4473649 [select]:
    goroutine 4473787 [select]:
    goroutine 4473902 [select]:
    goroutine 4473519 [chan receive]:
    goroutine 4473558 [chan receive]:
    goroutine 4473641 [chan receive]:
    goroutine 4473422 [IO wait]:
    goroutine 4473861 [select]:
    goroutine 4473936 [select]:
    goroutine 4473589 [select]:
    goroutine 4473644 [chan receive]:
    goroutine 4473767 [chan receive]:
    goroutine 4473557 [chan receive]:
    goroutine 4473904 [select]:
    goroutine 4473710 [IO wait]:
    goroutine 4473677 [select]:
    goroutine 4473878 [select]:
    goroutine 4473905 [IO wait]:
    goroutine 4473620 [IO wait]:
    goroutine 4473599 [select]:
    goroutine 4473539 [chan receive]:
    goroutine 4473789 [select]:
    goroutine 4473659 [IO wait]:
    goroutine 4473810 [select]:
    goroutine 4473666 [select]:
    goroutine 4473901 [IO wait]:
    goroutine 4473733 [chan receive]:
    goroutine 4473798 [select]:
    goroutine 4473573 [select]:
    goroutine 4473867 [select]:
    goroutine 4473579 [select]:
    goroutine 4473738 [chan receive]:
    goroutine 4473853 [select]:
    goroutine 4473685 [IO wait]:
    goroutine 4473762 [select]:
    goroutine 4473737 [chan receive]:
    goroutine 4473707 [select]:
    goroutine 4473938 [chan receive]:
    goroutine 4473642 [chan receive]:
    goroutine 4473791 [IO wait]:
    goroutine 4473760 [select]:
    goroutine 4473749 [select]:
    goroutine 4473786 [IO wait]:
    goroutine 4473540 [chan receive]:
    goroutine 4473825 [IO wait]:
    goroutine 4473575 [select]:
    goroutine 4473572 [IO wait]:
    goroutine 4473587 [select]:
    goroutine 4473576 [IO wait]:
    goroutine 4473769 [chan receive]:
    goroutine 4473746 [IO wait]:
    goroutine 4473881 [IO wait]:
    goroutine 4473800 [select]:
    goroutine 4473792 [select]:
    goroutine 4473736 [chan receive]:
    goroutine 4459732 [IO wait]:
    goroutine 4473843 [select]:
    goroutine 4473939 [IO wait]:
    goroutine 4473688 [select]:
    goroutine 4473433 [IO wait]:
    goroutine 4473647 [select]:
    goroutine 4473425 [IO wait]:
    goroutine 4473897 [IO wait]:
    goroutine 4473860 [IO wait]:
    goroutine 4473711 [select]:
    goroutine 4473882 [select]:
    goroutine 4473679 [IO wait]:
    goroutine 4473796 [select]:
    goroutine 4473734 [chan receive]:
    goroutine 4473537 [chan receive]:
    goroutine 4473809 [IO wait]:
    goroutine 4473812 [select]:
    goroutine 4473564 [IO wait]:
    goroutine 4473669 [IO wait]:
    goroutine 4473609 [select]:
    goroutine 4473429 [IO wait]:
    goroutine 4473520 [chan receive]:
    goroutine 4473559 [chan receive]:
    goroutine 4473855 [select]:
    goroutine 4473721 [select]:
    goroutine 4473518 [chan receive]:
prattmic commented 1 year ago

I have been considering if it is possible that the netpoll part in findRunnable is just pushing readied G's to the global run queue because it couldn't find an idle P (with injectglist), then it goes straight to stopm, and up the stack checkdead finds them with no active M.

Normally I'd say it seems unlikely to lose a race against all of the Ps like this, but since this is GOMAXPROCS=1 that makes it more plausible.

Currently working on obtaining a core dump.

A core dump should provide more insight. Consider building from a tip toolchain and setting GODEBUG=dontfreezetheworld=1, which was just added a few weeks ago. Normally the scheduler tries to stop all goroutines when throwing, which can obscure the scheduler state in a core dump. This flag will skip that behavior.

Setting GODEBUG=schedtrace=1000 will also add a scheduler state dump on throw, not dissimilar from the logging you added. (Multiple GODEBUG options can be set, comma separated)

I don't suppose you can share a program that reproduces this behavior? Do you know if this reproduced on older versions of Go?

prattmic commented 1 year ago

This happens on m3.medium AWS instances. I could not reproduce it anywhere else so far. This is a monothreaded amd64 instance

To clarify, you cannot reproduce elsewhere even when explicitly setting GOMAXPROCS=1?

lcoffe-botify commented 1 year ago

To clarify, you cannot reproduce elsewhere even when explicitly setting GOMAXPROCS=1?

I was not able to reproduce it on other instance types so far (ie forcing GOMAXPROCS=1 in a multithreaded environment). I should try again with things I learned in the interim.

I don't suppose you can share a program that reproduces this behavior? Do you know if this reproduced on older versions of Go?

I have no minimal reproducer that I can share yet. I did not try to switch to older versions yet, as anyway I couldn't go past 1.19 (would that still be useful?).

A core dump should provide more insight. Consider building from a tip toolchain and setting GODEBUG=dontfreezetheworld=1, which was just added a few weeks ago. Normally the scheduler tries to stop all goroutines when throwing, which can obscure the scheduler state in a core dump. This flag will skip that behavior.

Thank you for your suggestions. I tried to get a core dump with GODEBUG=dontfreezetheworld=1, but the process did hang. The checkdead debug line was printed, and the process just stay there. After ~40min, I concluded it won't quit, so I got a core dump with gcore, and killed it. That core dump did not help.

Still looking at the core dumps I got so far. I am trying to figure out how to share more context

lcoffe-botify commented 1 year ago

After reviewing enough core dumps, I could see that the threads are consistently in a state that indicates a race on shed.lock. This issue stood out, although it is not the same race condition happening. According to the stack traces below, M1 is crashing in checkdead, and M2 is locked on acquiring sched.lock in startm. Let's keep this definition of M1 and M2 for later reference. Line number are off, explained later on:

M1

 0  0x000000000046f021 in runtime.raise
    at /usr/local/go/src/runtime/sys_linux_amd64.s:154
 1  0x000000000044f0a5 in runtime.dieFromSignal
    at /usr/local/go/src/runtime/signal_unix.go:879
 2  0x000000000044f646 in runtime.sigfwdgo
    at /usr/local/go/src/runtime/signal_unix.go:1092
 3  0x000000000044dd67 in runtime.sigtrampgo
    at /usr/local/go/src/runtime/signal_unix.go:432
 4  0x000000000046f326 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_amd64.s:354
 5  0x000000000046f420 in runtime.sigreturn
    at /usr/local/go/src/runtime/sys_linux_amd64.s:468
 6  0x000000000044f2b8 in runtime.crash
    at /usr/local/go/src/runtime/signal_unix.go:971
 7  0x0000000000437851 in runtime.fatalthrow.func1
    at /usr/local/go/src/runtime/panic.go:1127
 8  0x00000000004377cc in runtime.fatalthrow
    at /usr/local/go/src/runtime/panic.go:1120
 9  0x000000000043747d in runtime.throw
    at /usr/local/go/src/runtime/panic.go:1047
10  0x0000000000445dcb in runtime.checkdead.func1
    at /usr/local/go/src/runtime/proc.go:5397
11  0x000000000043ac3c in runtime.forEachG
    at /usr/local/go/src/runtime/proc.go:591
12  0x000000000044594f in runtime.checkdead
    at /usr/local/go/src/runtime/proc.go:5376
13  0x0000000000447233 in runtime.mput
    at /usr/local/go/src/runtime/proc.go:5968
14  0x000000000043e389 in runtime.stopm
    at /usr/local/go/src/runtime/proc.go:2382

M2 (always waking up scavengerState)

 0  0x000000000046f5c3 in runtime.futex
    at /usr/local/go/src/runtime/sys_linux_amd64.s:555
 1  0x0000000000433556 in runtime.futexsleep
    at /usr/local/go/src/runtime/os_linux.go:69
 2  0x000000000040c2d9 in runtime.lock2
    at /usr/local/go/src/runtime/lock_futex.go:107
 3  0x000000000043e4d4 in runtime.lockWithRank
    at /usr/local/go/src/runtime/lockrank_off.go:22
 4  0x000000000043e4d4 in runtime.lock
    at /usr/local/go/src/runtime/lock_futex.go:48
 5  0x000000000043e4d4 in runtime.startm
    at /usr/local/go/src/runtime/proc.go:2428
 6  0x0000000000465732 in runtime.injectglist.func1
    at /usr/local/go/src/runtime/proc.go:3422
 7  0x0000000000440cfc in runtime.injectglist
    at /usr/local/go/src/runtime/proc.go:3436
 8  0x0000000000422725 in runtime.(*scavengerState).wake
    at /usr/local/go/src/runtime/mgcscavenge.go:428
 9  0x0000000000446205 in runtime.sysmon
    at /usr/local/go/src/runtime/proc.go:5655
10  0x000000000043cb93 in runtime.mstart1
    at /usr/local/go/src/runtime/proc.go:1513
11  0x000000000043cada in runtime.mstart0
    at /usr/local/go/src/runtime/proc.go:1470
12  0x000000000046b525 in runtime.mstart
    at /usr/local/go/src/runtime/asm_amd64.s:395
13  0x000000000046f66f in runtime.clone
    at /usr/local/go/src/runtime/sys_linux_amd64.s:615

I concluded that something is blocking M2 from starting a M before stopm is called in M1. To understand what's happening with the lock before entering this state, I added traces around all lock(&sched.lock) and unlock(&sched.lock).. Here is the trace when checkdead panics (not using the exact commit here but using the original throw in checkdead to get a coredump):

eventtrace: t=45456676411335, g=0 m=3, findRunnable acquiring sched lock (2)
eventtrace: t=45456676412160, g=0 m=3, findRunnable acquired sched lock (2)
eventtrace: t=45456676412966, g=0 m=3, findRunnable releasing sched lock (2)
eventtrace: t=45456676413678, g=0 m=3, findRunnable released sched lock (2)
eventtrace: t=45456676431199, g=0 m=3, findRunnable acquiring sched lock (3)
eventtrace: t=45456676431945, g=0 m=3, findRunnable acquired sched lock (3)
eventtrace: t=45456676433067, g=0 m=3, findRunnable releasing sched lock (6)
eventtrace: t=45456676433798, g=0 m=3, findRunnable released sched lock (6)
eventtrace: t=45456676664081, g=0 m=3, findRunnable acquiring sched lock (4)
eventtrace: t=45456676665287, g=0 m=3, findRunnable acquired sched lock (4)
eventtrace: t=45456676666597, g=0 m=3, findRunnable releasing sched lock (7)
eventtrace: t=45456676667355, g=0 m=3, findRunnable released sched lock (7)
eventtrace: t=45456676697198, g=0 m=3, findRunnable acquiring sched lock (3)
eventtrace: t=45456676698169, g=0 m=3, findRunnable acquired sched lock (3)
eventtrace: t=45456676699049, g=0 m=3, findRunnable releasing sched lock (6)
eventtrace: t=45456676699769, g=0 m=3, findRunnable released sched lock (6)
eventtrace: t=45456690354570, g=0 m=3, findRunnable acquiring sched lock (4)
eventtrace: t=45456690355906, g=0 m=3, findRunnable acquired sched lock (4)
eventtrace: t=45456690357369, g=0 m=3, findRunnable releasing sched lock (7)
eventtrace: t=45456690358155, g=0 m=3, findRunnable released sched lock (7)
eventtrace: t=45456690401389, g=0 m=3, findRunnable acquiring sched lock (3)
eventtrace: t=45456690402194, g=0 m=3, findRunnable acquired sched lock (3)
eventtrace: t=45456690403173, g=0 m=3, findRunnable releasing sched lock (6)
eventtrace: t=45456690403902, g=0 m=3, findRunnable released sched lock (6)
eventtrace: t=45456690411975, g=0 m=1, sysmon acquiring sched lock (2)
eventtrace: t=45456690412994, g=0 m=1, sysmon acquired sched lock (2)
eventtrace: t=45456690414262, g=0 m=1, sysmon releasing sched lock (2)
eventtrace: t=45456690414987, g=0 m=1, sysmon released sched lock (2)
eventtrace: t=45456714142246, g=0 m=3, findRunnable acquiring sched lock (4)
eventtrace: t=45456714144224, g=0 m=3, findRunnable acquired sched lock (4)
eventtrace: t=45456714146004, g=0 m=3, findRunnable releasing sched lock (7)
eventtrace: t=45456714146984, g=0 m=3, findRunnable released sched lock (7)
eventtrace: t=45456714152494, g=0 m=3, entersyscall_sysmon acquiring sched lock
eventtrace: t=45456714153484, g=0 m=3, entersyscall_sysmon acquired sched lock
eventtrace: t=45456714157218, g=0 m=3, entersyscall_sysmon releasing sched lock
eventtrace: t=45456714158047, g=0 m=3, entersyscall_sysmon released sched lock
eventtrace: t=45456714211610, g=0 m=3, findRunnable acquiring sched lock (3)
eventtrace: t=45456714212617, g=0 m=3, findRunnable acquired sched lock (3)
eventtrace: t=45456714213942, g=0 m=3, findRunnable releasing sched lock (6)
eventtrace: t=45456714214770, g=0 m=3, findRunnable released sched lock (6)
eventtrace: t=45456714222397, g=0 m=1, sysmon acquiring sched lock (3)
eventtrace: t=45456714223447, g=0 m=1, sysmon acquired sched lock (3)
eventtrace: t=45456714224331, g=0 m=1, sysmon releasing sched lock (3)
eventtrace: t=45456714225123, g=0 m=1, sysmon released sched lock (3)
eventtrace: t=45456714231759, g=0 m=1, injectglist acquiring sched lock (2)
eventtrace: t=45456714232704, g=0 m=1, injectglist acquired sched lock (2)
eventtrace: t=45456714233694, g=0 m=1, injectglist releasing sched lock (3)
eventtrace: t=45456714234466, g=0 m=1, injectglist released sched lock (3)
eventtrace: t=45456714235606, g=0 m=1, injectglist acquiring sched lock (1)
eventtrace: t=45456714236386, g=0 m=1, injectglist acquired sched lock (1)
eventtrace: t=45456714237981, g=0 m=1, injectglist releasing sched lock (2)
eventtrace: t=45456714238734, g=0 m=1, injectglist released sched lock (2)
eventtrace: t=45456722489114, g=0 m=3, findRunnable acquiring sched lock (4)
eventtrace: t=45456722490381, g=0 m=3, findRunnable acquired sched lock (4)
eventtrace: t=45456722491302, g=0 m=3, findRunnable releasing sched lock (7)
eventtrace: t=45456722492095, g=0 m=3, findRunnable released sched lock (7)
eventtrace: t=45456722493297, g=0 m=3, injectglist acquiring sched lock (2)
eventtrace: t=45456722494024, g=0 m=3, injectglist acquired sched lock (2)
eventtrace: t=45456722494834, g=0 m=3, injectglist releasing sched lock (3)
eventtrace: t=45456722495587, g=0 m=3, injectglist released sched lock (3)
eventtrace: t=45456722496541, g=0 m=3, injectglist acquiring sched lock (1)
eventtrace: t=45456722497259, g=0 m=3, injectglist acquired sched lock (1)
eventtrace: t=45456722498017, g=0 m=3, injectglist releasing sched lock (1)
eventtrace: t=45456722498746, g=0 m=3, injectglist released sched lock (1)
eventtrace: t=45456722499544, g=0 m=3, findRunnable: no runnable G
eventtrace: t=45456722500515, g=0 m=3, stopm acquiring sched lock
eventtrace: t=45456722501299, g=0 m=3, stopm acquired sched lock

Here, m=1 is our M2 (scavengerState waking up), and m=3 is our M1 (checkdead panic).

It is a little clearer what's happening here: M2 is stealing P (remember, monothreaded instance) from M1, then yielding somehow, making M1 push runnable goroutines from netpoll to the global run queue, and proceed to panic in checkdead as no M is available, M2 being stuck trying to acquire sched.lock.

It looks like M2 is preempted before being able to reacquire the scheduler lock in startm. But looking in injectglist, it seems like preemption is already disabled before calling startm, as intended:

    startIdle := func(n int) {
        for i := 0; i < n; i++ {
            mp := acquirem() // See comment in startm.
            pushEventTrace("injectglist acquiring sched lock (1)")
            lock(&sched.lock)
            pushEventTrace("injectglist acquired sched lock (1)")

            pp, _ := pidlegetSpinning(0) // stealing the P from M1 here
            if pp == nil {
                pushEventTrace("injectglist releasing sched lock (1)")
                unlock(&sched.lock)
                pushEventTrace("injectglist released sched lock (1)")
                releasem(mp)
                break
            }

            pushEventTrace("injectglist releasing sched lock (2)")
            unlock(&sched.lock)
            pushEventTrace("injectglist released sched lock (2)")
            startm(pp, false) // preempted?? mp.locks=5 at this stage
            releasem(mp)
        }
    }

What could be happening here?

I am trying to solve this by making sure injectglist does not release the lock before calling startm. Waiting for results...

prattmic commented 1 year ago

Thanks for continuing to dig. I'll look more closely at your logs in a bit.

I will note that checkdead() is called with sched.lock held, so it is possible that threads stuck on sched.lock are a bit of a red herring that are just there because checkdead is holding the lock. OTOH, I could believe there is a real bug here.

Nice event trace log you added, that looks nice! For future reference, the runtime has similar functionality built in in runtime/debuglog.go (build with -tags debuglog to enable). Use looks like:

d := dlog()
if d != nil {
  d.s("foo")
  d.i(42)
  d.end()
}

The buffer will be dumped on throw.

lcoffe-botify commented 1 year ago

I am relatively new to the runtime package and would have appreciated discovering debuglog earlier (although I did learn about go's write barriers in the process 🤦).

I agree that the state of M2 in the stack trace does not accurately represent the crash. I think it might not even be entering startm, although this detail is far from crucial.

We may have identified the issue. I previously mentioned a previous fix regarding the conflicting relationship between startm and checkdead. The solution to our issue can be found within the startm code:

    nmp := mget()
    if nmp == nil {
        // No M is available, we must drop sched.lock and call newm.
        // However, we already own a P to assign to the M.
        //
        // Once sched.lock is released, another G (e.g., in a syscall),
        // could find no idle P while checkdead finds a runnable G but
        // no running M's because this new M hasn't started yet, thus
        // throwing in an apparent deadlock.
        //
        // Avoid this situation by pre-allocating the ID for the new M,
        // thus marking it as 'running' before we drop sched.lock. This
        // new M will eventually run the scheduler to execute any
        // queued G's.
        id := mReserveID()
        unlock(&sched.lock)

        var fn func()
        if spinning {
            // The caller incremented nmspinning, so set m.spinning in the new M.
            fn = mspinning
        }
        newm(fn, pp, id)

        // Ownership transfer of pp committed by start in newm.
        // Preemption is now safe.
        releasem(mp)
        return
    }

The crux of the issue is that if the lock is not acquired at the beginning of startm because another G steals it, the same problem can occur when checkdead finds no available M's.

I propose a fix that includes introducing a new lockheld parameter to startm. This parameter governs all lock and unlock calls within startm. Additionally, a slight modification to the "newm case" is made to reacquire the lock, ensuring the caller's expected state is maintained.

// in our case, the caller (injectglist) acquires the sched lock and calls startm with lockheld=true

func startm(pp *p, spinning, lockheld bool) {
    mp := acquirem()

    if !lockheld {
        lock(&sched.lock)
    }

    [...]

    nmp := mget()
    if nmp == nil {
        id := mReserveID()
        unlock(&sched.lock) // <- unlock no matter what

        var fn func()
        if spinning {
            // The caller incremented nmspinning, so set m.spinning in the new M.
            fn = mspinning
        }
        newm(fn, pp, id)

        if lockheld {
            lock(&sched.lock) // <- reacquire lock if necessary
        }

        releasem(mp)
        return
    }

All existing references to startm should pass false as lockheld, except for one specific case: the startIdle function variable in injectglist would call startm without releasing the lock:

    startIdle := func(n int) {
        for i := 0; i < n; i++ {
            mp := acquirem() // See comment in startm.
            lock(&sched.lock)

            pp, _ := pidlegetSpinning(0)
            if pp == nil {
                unlock(&sched.lock)
                releasem(mp)
                break
            }

            startm(pp, false, true)
            unlock(&sched.lock)
            releasem(mp)
        }
    }

I am still validating this fix.

prattmic commented 1 year ago

Great find! I think you are right, this sounds like an issue. Would you like to send a CL?

lcoffe-botify commented 1 year ago

Sure! I am sending a CL

Regarding my previous attempt with GODEBUG=dontfreezetheworld=1, I revisited the coredump, and it appears that dontfreezetheworld is not the (direct) culprit. Instead, a deadlock in this specific case prevented the program from crashing and generating the coredump.

With schedtrace or scheddebug, an unusual behavior emerges. startpanic_m calls schedtrace(true) before checking for debug.dontfreezetheworld. However, schedtrace(true) tries to acquire sched.lock, which is already held in the checkdead context. I'm uncertain about what's happening, but this might be worth investigating further.

Here's the stacktrace after around 40 minutes of deadlock in this specific instance:

 0  0x00000000004751c3 in runtime.futex
    at /Users/luciencoffe/sdk/gotip/src/runtime/sys_linux_amd64.s:558
 1  0x00000000004390b0 in runtime.futexsleep
    at /Users/luciencoffe/sdk/gotip/src/runtime/os_linux.go:69
 2  0x000000000040bb45 in runtime.lock2
    at /Users/luciencoffe/sdk/gotip/src/runtime/lock_futex.go:107
 3  0x000000000040c373 in runtime.lockWithRank
    at /Users/luciencoffe/sdk/gotip/src/runtime/lockrank_off.go:22
 4  0x000000000040baa5 in runtime.lock
    at /Users/luciencoffe/sdk/gotip/src/runtime/lock_futex.go:48
 5  0x000000000044a6e8 in runtime.schedtrace
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:5645
 6  0x000000000043d9f0 in runtime.startpanic_m
    at /Users/luciencoffe/sdk/gotip/src/runtime/panic.go:1243
 7  0x000000000043d6ee in runtime.fatalthrow.func1
    at /Users/luciencoffe/sdk/gotip/src/runtime/panic.go:1150
 8  0x000000000043d685 in runtime.fatalthrow
    at /Users/luciencoffe/sdk/gotip/src/runtime/panic.go:1149
 9  0x000000000043d37c in runtime.throw
    at /Users/luciencoffe/sdk/gotip/src/runtime/panic.go:1076
10  0x0000000000449ee5 in runtime.checkdead.func1
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:5294
11  0x0000000000440811 in runtime.forEachG
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:607
12  0x0000000000449bd7 in runtime.checkdead
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:5281
13  0x000000000044b169 in runtime.mput
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:5774
14  0x000000000044391c in runtime.stopm
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:2410
15  0x0000000000444430 in runtime.findRunnable
15  0x0000000000444430 in runtime.findRunnable
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:3085
16  0x0000000000445afd in runtime.schedule
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:3438
17  0x0000000000446019 in runtime.park_m
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:3589
18  0x00000000004714ee in runtime.mcall
    at /Users/luciencoffe/sdk/gotip/src/runtime/asm_amd64.s:458

Another thread is stuck attempting to acquire the lock, specifically the injectglist/startm thread:

 0  0x00000000004751c3 in runtime.futex
    at /Users/luciencoffe/sdk/gotip/src/runtime/sys_linux_amd64.s:558
 1  0x00000000004390b0 in runtime.futexsleep
    at /Users/luciencoffe/sdk/gotip/src/runtime/os_linux.go:69
 2  0x000000000040bb45 in runtime.lock2
    at /Users/luciencoffe/sdk/gotip/src/runtime/lock_futex.go:107
 3  0x000000000040c373 in runtime.lockWithRank
    at /Users/luciencoffe/sdk/gotip/src/runtime/lockrank_off.go:22
 4  0x000000000040baa5 in runtime.lock
    at /Users/luciencoffe/sdk/gotip/src/runtime/lock_futex.go:48
 5  0x0000000000443a16 in runtime.startm
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:2452
 6  0x000000000046aed8 in runtime.injectglist.func1
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:3373
 7  0x000000000044593a in runtime.injectglist
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:3383
 8  0x0000000000425f3d in runtime.(*scavengerState).wake
    at /Users/luciencoffe/sdk/gotip/src/runtime/mgcscavenge.go:427
 9  0x000000000044a247 in runtime.sysmon
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:5476
10  0x0000000000442473 in runtime.mstart1
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:1507
11  0x00000000004423c5 in runtime.mstart0
    at /Users/luciencoffe/sdk/gotip/src/runtime/proc.go:1464
12  0x0000000000471465 in runtime.mstart
    at /Users/luciencoffe/sdk/gotip/src/runtime/asm_amd64.s:395
13  0x000000000047526f in runtime.clone
    at /Users/luciencoffe/sdk/gotip/src/runtime/sys_linux_amd64.s:618
gopherbot commented 1 year ago

Change https://go.dev/cl/487316 mentions this issue: runtime: resolve checkdead panic by refiningstartmlock handling in caller context

prattmic commented 1 year ago

Regarding schedtrace, we have this problem in a few other places that throw can be called in nearly any context with arbitrary locks held.

The throws in checkdead should unlock sched.lock before throwing. The final one does already (https://cs.opensource.google/go/go/+/master:src/runtime/proc.go;l=5258;bpv=1;bpt=1?), but not the others.

lcoffe-botify commented 1 year ago

submitted https://go.dev/cl/487316 for #59758

SuperQ commented 1 year ago

Is it possible to get this bugfix backported to 1.20? It is affecting users of Prometheus monitoring as we default GOMAXPROCS=1 for the node_exporter to reduce problems with data races in the Linux kernel.

ianlancetaylor commented 1 year ago

@gopherbot Please open a backport issue to 1.20.

Quoting @SuperQ :

Is it possible to get this bugfix backported to 1.20? It is affecting users of Prometheus monitoring as we default GOMAXPROCS=1 for the node_exporter to reduce problems with data races in the Linux kernel.

gopherbot commented 1 year ago

Backport issue(s) opened: #60760 (for 1.20).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

prattmic commented 1 year ago

@gopherbot Please open a backport to 1.19. This issue is not new in 1.20, and potentially affects users with GOMAXPROCS=1.

gopherbot commented 1 year ago

Change https://go.dev/cl/504395 mentions this issue: [release-branch.go1.20] runtime: resolve checkdead panic by refiningstartmlock handling in caller context

gopherbot commented 1 year ago

Change https://go.dev/cl/504396 mentions this issue: [release-branch.go1.19] runtime: resolve checkdead panic by refiningstartmlock handling in caller context