Open fernando-az-alpaca opened 11 months ago
CC @bradfitz, @kardianos, @kevinburke.
We just hit this too. 🤦‍♂️
We had a fun bug recently where some code got refactored and introduced this bug, moving the Err check to the wrong spot:
func QueryJSONRow[T any](tx *Tx, query string, args ...any) (*T, error) {
	rows, err := tx.Query(query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	if !rows.Next() {
		return nil, ErrNotFound
	}
	var j sql.RawBytes
	if err := rows.Scan(&j); err != nil {
		return nil, err
	}
	var ret T
	if err := json.Unmarshal(j, &ret); err != nil {
		return nil, err
	}
	if err := rows.Err(); err != nil { // <------ the bug
		return nil, err
	}
	return &ret, nil
}
Note the rows.Err call while the rows are still open (no explicit Close call yet and Next has not reported false).
Unfortunately, that used to always work (prior to the fix for #60304), and it also mostly worked with our driver for a few days until we got unlucky with goroutine timing and it deadlocked:
1 @ 0x44396e 0x4568e5 0x4568b4 0x477845 0x97ad85 0x97ad60 0x19e943e 0x19a767a 0x19ca65e 0x19caba5 0x190aed0 0x190adf8 0x190af38 0x197951a 0x1979fc5 0x18e9c85 0x1e4b4da 0x1e4d66d 0x1e38aca 0x1e40405 0x1e36a0c 0x1e392e5 0x1e3569f 0x1d10cfa 0x13e2ac9 0x13e3d44 0x13e2ff4 0x12cd262 0x1d0e347 0x7a6149 0x1b591d5 0x1cc8f7b
# 0x477844 sync.runtime_SemacquireRWMutexR+0x24 runtime/sema.go:82
# 0x97ad84 sync.(*RWMutex).RLock+0x64 sync/rwmutex.go:70
# 0x97ad5f database/sql.(*Rows).Err+0x3f database/sql/sql.go:3122
/cc @maisem @andrew-d
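For comparison, here is a rough sketch of an ordering that only consults Err once Next has reported false or after an explicit Close, which is what the database/sql docs describe for Rows.Err. This is not the code we shipped: rowIter, scanJSONRow and fakeRows are hypothetical names made up for illustration, and the fake stands in for a real driver so the example runs on its own. It also scans into []byte rather than sql.RawBytes so the bytes stay valid after Close.

```go
package main

import (
	"database/sql"
	"encoding/json"
	"errors"
	"fmt"
)

// ErrNotFound is redefined here only so the sketch compiles on its own.
var ErrNotFound = errors.New("not found")

// rowIter is a hypothetical interface covering the *sql.Rows methods used below.
type rowIter interface {
	Next() bool
	Scan(dest ...any) error
	Err() error
	Close() error
}

// Compile-time check that *sql.Rows satisfies rowIter.
var _ rowIter = (*sql.Rows)(nil)

// scanJSONRow decodes one JSON column from the first row. It calls Err only
// after Next has reported false or after an explicit Close.
func scanJSONRow[T any](rows rowIter) (*T, error) {
	defer rows.Close() // Close is idempotent on *sql.Rows
	if !rows.Next() {
		// Next reported false, so Err may be consulted.
		if err := rows.Err(); err != nil {
			return nil, err
		}
		return nil, ErrNotFound
	}
	var j []byte // Scan copies into []byte, so j remains valid after Close
	if err := rows.Scan(&j); err != nil {
		return nil, err
	}
	// Close explicitly before asking for the iteration error.
	if err := rows.Close(); err != nil {
		return nil, err
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}
	var ret T
	if err := json.Unmarshal(j, &ret); err != nil {
		return nil, err
	}
	return &ret, nil
}

// fakeRows is a hypothetical in-memory stand-in for *sql.Rows. It records
// whether Err was ever called while the rows were still open.
type fakeRows struct {
	data         [][]byte
	pos          int
	closed       bool
	errWhileOpen bool
}

func (f *fakeRows) Next() bool {
	if f.closed || f.pos >= len(f.data) {
		f.closed = true // running out of rows closes the rows implicitly
		return false
	}
	f.pos++
	return true
}

func (f *fakeRows) Scan(dest ...any) error {
	if len(dest) != 1 {
		return errors.New("fakeRows: expected exactly one destination")
	}
	p, ok := dest[0].(*[]byte)
	if !ok {
		return errors.New("fakeRows: unsupported destination type")
	}
	*p = append([]byte(nil), f.data[f.pos-1]...)
	return nil
}

func (f *fakeRows) Err() error {
	if !f.closed {
		f.errWhileOpen = true
	}
	return nil
}

func (f *fakeRows) Close() error {
	f.closed = true
	return nil
}

func main() {
	type user struct {
		Name string `json:"name"`
	}
	rows := &fakeRows{data: [][]byte{[]byte(`{"name":"gopher"}`)}}
	u, err := scanJSONRow[user](rows)
	fmt.Println(u.Name, err, rows.errWhileOpen) // prints: gopher <nil> false
}
```

With a real *sql.Rows you would pass the value returned by Query straight into scanJSONRow. Whether calling Err earlier should be tolerated is still the open question in this issue; the sketch just sidesteps it.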
Go version
go version go1.21.4 linux/arm64
What operating system and processor architecture are you using (go env)?
What did you do?
Note: whether this is a legitimate bug or a usage error depends on the answer to the question: "after this change https://github.com/golang/go/issues/60304#issuecomment-1560205463, is it an error to call Rows.Err() after a context cancellation but before calling Rows.Close()?" Thanks in advance for taking a look, and I apologize if I've made any mistakes.
Note: a full docker-compose setup for the code below and the additional required components is available at https://github.com/martonw/go-sql-deadlock.
What did you expect to see?
No deadlocks (assuming this pattern is legitimate; I'll add that this pattern is used by the gorm package, so it potentially affects a wide audience and is not as hard to hit as it might seem at first glance).
What did you see instead?
The code above will sometimes (but not always) hang/deadlock.