polarsignals / frostdb

❄️ Coolest database around 🧊 Embeddable column database written in Go.
Apache License 2.0

Data race when opening database #879

Open thorfour opened 2 months ago

thorfour commented 2 months ago
2024-05-22T13:41:38.6474307Z ==================
2024-05-22T13:41:38.6474647Z WARNING: DATA RACE
2024-05-22T13:41:38.6475059Z Write at 0x00c0000c9e88 by goroutine 2933:
2024-05-22T13:41:38.6475656Z   github.com/polarsignals/frostdb.(*ColumnStore).DB.func2()
2024-05-22T13:41:38.6476329Z       /home/runner/work/frostdb/frostdb/db.go:557 +0x5dd
2024-05-22T13:41:38.6476992Z   github.com/polarsignals/frostdb.(*ColumnStore).DB()
2024-05-22T13:41:38.6477625Z       /home/runner/work/frostdb/frostdb/db.go:578 +0xeb0
2024-05-22T13:41:38.6478302Z   github.com/polarsignals/frostdb.(*ColumnStore).recoverDBsFromStorage.func1()
2024-05-22T13:41:38.6479134Z       /home/runner/work/frostdb/frostdb/db.go:341 +0x19c
2024-05-22T13:41:38.6479690Z   golang.org/x/sync/errgroup.(*Group).Go.func1()
2024-05-22T13:41:38.6480550Z       /home/runner/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x97
2024-05-22T13:41:38.6481013Z 
2024-05-22T13:41:38.6481246Z Previous read at 0x00c0000c9e88 by goroutine 3017:
2024-05-22T13:41:38.6482169Z   github.com/polarsignals/frostdb.(*Table).writeBlock.func1()
2024-05-22T13:41:38.6482915Z       /home/runner/work/frostdb/frostdb/table.go:529 +0x41c
2024-05-22T13:41:38.6483523Z   github.com/polarsignals/frostdb.(*Table).writeBlock()
2024-05-22T13:41:38.6484197Z       /home/runner/work/frostdb/frostdb/table.go:548 +0xbe4
2024-05-22T13:41:38.6484870Z   github.com/polarsignals/frostdb.(*DB).recover.func2.3()
2024-05-22T13:41:38.6485520Z       /home/runner/work/frostdb/frostdb/db.go:847 +0x64
2024-05-22T13:41:38.6485849Z 
2024-05-22T13:41:38.6486043Z Goroutine 2933 (running) created at:
2024-05-22T13:41:38.6486561Z   golang.org/x/sync/errgroup.(*Group).Go()
2024-05-22T13:41:38.6487345Z       /home/runner/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x124
2024-05-22T13:41:38.6488165Z   github.com/polarsignals/frostdb.(*ColumnStore).recoverDBsFromStorage()
2024-05-22T13:41:38.6488964Z       /home/runner/work/frostdb/frostdb/db.go:339 +0x564
2024-05-22T13:41:38.6489516Z   github.com/polarsignals/frostdb.New()
2024-05-22T13:41:38.6490056Z       /home/runner/work/frostdb/frostdb/db.go:142 +0xc84
2024-05-22T13:41:38.6490712Z   github.com/polarsignals/frostdb.TestDBRecover.func7()
2024-05-22T13:41:38.6491612Z       /home/runner/work/frostdb/frostdb/db_test.go:1420 +0x86c
2024-05-22T13:41:38.6492213Z   testing.tRunner()
2024-05-22T13:41:38.6492852Z       /opt/hostedtoolcache/go/1.21.10/x64/src/testing/testing.go:1595 +0x261
2024-05-22T13:41:38.6493406Z   testing.(*T).Run.func1()
2024-05-22T13:41:38.6494097Z       /opt/hostedtoolcache/go/1.21.10/x64/src/testing/testing.go:1648 +0x44
2024-05-22T13:41:38.6494553Z 
2024-05-22T13:41:38.6494709Z Goroutine 3017 (finished) created at:
2024-05-22T13:41:38.6495238Z   github.com/polarsignals/frostdb.(*DB).recover.func2()
2024-05-22T13:41:38.6495946Z       /home/runner/work/frostdb/frostdb/db.go:847 +0xa5b
2024-05-22T13:41:38.6496561Z   github.com/polarsignals/frostdb/wal.(*FileWAL).Replay()
2024-05-22T13:41:38.6497209Z       /home/runner/work/frostdb/frostdb/wal/wal.go:642 +0xb45
2024-05-22T13:41:38.6497878Z   github.com/polarsignals/frostdb.(*DB).recover()
2024-05-22T13:41:38.6498498Z       /home/runner/work/frostdb/frostdb/db.go:764 +0x1370
2024-05-22T13:41:38.6499056Z   github.com/polarsignals/frostdb.(*DB).openWAL()
2024-05-22T13:41:38.6499747Z       /home/runner/work/frostdb/frostdb/db.go:660 +0x1e4
2024-05-22T13:41:38.6500329Z   github.com/polarsignals/frostdb.(*ColumnStore).DB.func2.1()
2024-05-22T13:41:38.6500992Z       /home/runner/work/frostdb/frostdb/db.go:547 +0x1fe
2024-05-22T13:41:38.6501667Z   github.com/polarsignals/frostdb.(*ColumnStore).DB.func2()
2024-05-22T13:41:38.6502319Z       /home/runner/work/frostdb/frostdb/db.go:549 +0x468
2024-05-22T13:41:38.6502967Z   github.com/polarsignals/frostdb.(*ColumnStore).DB()
2024-05-22T13:41:38.6503590Z       /home/runner/work/frostdb/frostdb/db.go:578 +0xeb0
2024-05-22T13:41:38.6504262Z   github.com/polarsignals/frostdb.(*ColumnStore).recoverDBsFromStorage.func1()
2024-05-22T13:41:38.6505106Z       /home/runner/work/frostdb/frostdb/db.go:341 +0x19c
2024-05-22T13:41:38.6505661Z   golang.org/x/sync/errgroup.(*Group).Go.func1()
2024-05-22T13:41:38.6506447Z       /home/runner/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x97
2024-05-22T13:41:38.6507172Z ==================

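Spawning a writeBlock goroutine while opening a database causes a data race: per the trace above, the goroutine reads a memory location that ColumnStore.DB writes to after openWAL returns. This was introduced in commit ef7d1fe.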
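For readers unfamiliar with this class of bug, here is a minimal sketch of the pattern the trace suggests and one way to order the accesses. It is not FrostDB code: the db, wal, nopWAL, fileWAL, replay, and openDB names are hypothetical stand-ins, and the assumption that the contended field is a WAL reference is an illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// wal is a hypothetical stand-in for a write-ahead-log dependency.
type wal interface{ Name() string }

type nopWAL struct{}

func (nopWAL) Name() string { return "nop" }

type fileWAL struct{}

func (fileWAL) Name() string { return "file" }

// db is a hypothetical database handle whose wal field is swapped during open.
type db struct {
	wal wal
	bg  sync.WaitGroup // tracks background workers spawned during replay
}

// replay spawns a background worker that reads d.wal, similar in shape to a
// block-writing goroutine started while recovering.
func (d *db) replay(seen chan<- string) {
	d.bg.Add(1)
	go func() {
		defer d.bg.Done()
		seen <- d.wal.Name() // read of d.wal from another goroutine
	}()
}

// openDB replays with a no-op WAL, then installs the real one.
func openDB() (*db, string) {
	d := &db{wal: nopWAL{}}
	seen := make(chan string, 1) // buffered so the worker never blocks
	d.replay(seen)
	// Without this Wait, the write below is unsynchronized with the worker's
	// read, which is the kind of access pair the race detector reports.
	d.bg.Wait()
	d.wal = fileWAL{} // write of d.wal by the opener
	return d, <-seen
}

func main() {
	d, seen := openDB()
	fmt.Println(seen, d.wal.Name()) // prints "nop file"
}
```

Waiting for the worker is only one option; guarding the field with a mutex or an atomic pointer would also remove the race, at the cost of the worker possibly observing either value.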

asubiotto commented 2 months ago

I can't seem to repro this on current main (7269cf6):

$ go test -race -run TestDBRecover/SnapshotOnRecovery -count 100 --failfast
PASS
ok      github.com/polarsignals/frostdb 53.776s

Nothing was changed here after the failed commit so this is strange. Was it another test?

thorfour commented 2 months ago

It was definitely that test. I can't repro on our laptops either. I think this is exacerbated by the slow CI environment, which makes it more likely to happen there.

asubiotto commented 2 months ago

We should be able to repro easily with dst and by manually panicking if we try to log to a NopWAL (as discussed in the meeting). I'll leave this on the back burner for now, but I'm planning to come back to it.
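The "panic on log to a no-op WAL" idea could look roughly like the sketch below. This is an assumption-laden illustration, not FrostDB's API: recordLogger is a simplified hypothetical interface, and panicOnLogWAL and logPanics are made-up helper names. The point is that a write reaching the placeholder WAL fails loudly and deterministically instead of depending on scheduler timing.

```go
package main

import "fmt"

// recordLogger is a hypothetical, simplified WAL interface used only for
// this sketch; the real interface may differ.
type recordLogger interface {
	Log(tx uint64, payload []byte) error
}

// panicOnLogWAL is a hypothetical debugging variant of a no-op WAL: instead
// of silently dropping records, it panics so the offending caller shows up
// in a stack trace.
type panicOnLogWAL struct{}

func (panicOnLogWAL) Log(tx uint64, _ []byte) error {
	panic(fmt.Sprintf("write for tx %d reached a no-op WAL", tx))
}

// logPanics reports whether logging a record through w panics.
func logPanics(w recordLogger) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = w.Log(1, []byte("record"))
	return false
}

func main() {
	fmt.Println(logPanics(panicOnLogWAL{})) // prints "true"
}
```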