Open gandarez opened 1 year ago
Looks like the db file is corrupted. To skip the error, @gandarez could try passing PreLoadFreelist: false, but the freelist is always loaded in RW mode. Can this restriction be removed?
https://github.com/etcd-io/bbolt/blob/3e560dbae20dcb078d50f928ef7d17f1a56a4413/db.go#L182-L183
Thanks @gandarez for raising this issue and sorry for the inconvenience. Copied the call stack from https://github.com/wakatime/wakatime-cli/issues/848 below.
The error message indicates that meta page 0 might be corrupted (even though its checksum is somehow correct). Is it possible to provide the db file? (I saw your message saying you can't get the db file, but I still want to double confirm.)
Do you have a detailed step to reproduce this issue?
goroutine 1 [running]:
runtime/debug.Stack()
/opt/hostedtoolcache/go/1.19.6/x64/src/runtime/debug/stack.go:24 +0x65
github.com/wakatime/wakatime-cli/cmd.runCmd.func1()
/home/runner/work/wakatime-cli/wakatime-cli/cmd/run.go:272 +0xd3
panic({0x9a5540, 0xc00060b980})
/opt/hostedtoolcache/go/1.19.6/x64/src/runtime/panic.go:884 +0x212
go.etcd.io/bbolt.(*freelist).read(0x0?, 0x11bfa0c2000)
/home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.7/freelist.go:267 +0x22e
go.etcd.io/bbolt.(*DB).loadFreelist.func1()
/home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.7/db.go:415 +0xb8
sync.(*Once).doSlow(0xc000123608?, 0x10?)
/opt/hostedtoolcache/go/1.19.6/x64/src/sync/once.go:74 +0xc2
sync.(*Once).Do(...)
/opt/hostedtoolcache/go/1.19.6/x64/src/sync/once.go:65
go.etcd.io/bbolt.(*DB).loadFreelist(0xc000123440?)
/home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.7/db.go:408 +0x47
go.etcd.io/bbolt.Open({0xc0002fd260, 0x1a}, 0x0?, 0xc000378c20)
/home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.7/db.go:290 +0x40c
Or, if you can't provide the db file, run the commands below:
$ ./bbolt check <db-file>
$ ./bbolt pages <db-file>
$ ./bbolt page <db-file> 0
$ ./bbolt page <db-file> 1
@gandarez could try passing PreLoadFreelist: false, but the freelist is always loaded in RW mode
Note that bbolt always loads the freelist in write mode, no matter what value is set for PreLoadFreelist.
EDIT:
Can this restriction be removed?
No, we can't. Freelist management is the most crucial part of bbolt, and the freelist is always needed in write mode, so it must always be loaded there.
Sorry, I didn't explain the whole thing. What I meant was, if the user switches NoFreelistSync from false to true, db.Open() still loads the freelist.
I'm proposing changing: https://github.com/etcd-io/bbolt/blob/3e560dbae20dcb078d50f928ef7d17f1a56a4413/db.go#L253-L255
to
if db.PreLoadFreelist && !db.NoFreelistSync {
	db.loadFreelist()
}
That isn't correct. db.NoFreelistSync == true only means the freelist is not synced in this transaction; it doesn't mean the freelist isn't loaded. We still need to load the freelist, even if no freelist was synced in the previous transaction (bbolt will scan the whole db to reconstruct the freelist in this case).
Sorry for my misunderstanding. Currently, there is no way to skip loading freelist from the disk if meta page points to an existing freelist. Is that correct?
Currently, there is no way to skip loading freelist from the disk if meta page points to an existing freelist. Is that correct?
Correct. In write mode, bbolt always reads from disk to get the freelist (either from the synced freelist page or by scanning the whole db to reconstruct it).
The most important thing for now is to reproduce the issue ourselves. It would be great if @gandarez can provide some clues.
I can't promise anything since, as I said, it runs on our users' machines, but I'll try to get a copy of it.
With NoFreelistSync: false, the freelist is saved to a page and referenced from the meta page.
With NoFreelistSync: true, the freelist is not saved to the file and a special marker is put into the meta page.
The current freelist loading logic does not take the NoFreelistSync option into account.
https://github.com/etcd-io/bbolt/blob/e6563eef17d87c7e96e96fbb2b78be3e93d67ff1/db.go#L371-L383
By setting it to true, the user of the library accepts that the freelist will not be saved to disk and accepts the latency of scanning the whole db.
The loading behavior currently depends only on whether a freelist exists in the db file.
I have a proposal for adding NoFreelistSync into the decision:
if !db.hasSyncedFreelist() || db.NoFreelistSync {
	// Reconstruct the free list by scanning the DB.
	db.freelist.readIDs(db.freepages())
} else {
	// Read the free list from the freelist page.
	db.freelist.read(db.page(db.meta().Freelist()))
}
This may help to open the database by changing an option if the corruption is just in the freelist.
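To make the difference between the current and proposed behavior concrete, here is a minimal stand-alone sketch of the decision only. It does not use bbolt; the types meta and options, the noFreelistMarker constant, and the function names are hypothetical stand-ins, and the marker value is an assumption about how "no synced freelist" is represented in the meta page.

```go
package main

import "fmt"

// noFreelistMarker is a hypothetical stand-in for the special marker stored
// in the meta page when no freelist was synced (assumed value).
const noFreelistMarker = ^uint64(0)

// meta and options are toy models, not bbolt types.
type meta struct{ freelist uint64 }

type options struct{ NoFreelistSync bool }

type source string

const (
	fromPage source = "read freelist page"
	fromScan source = "scan whole db"
)

// hasSyncedFreelist reports whether the meta page points at a freelist page.
func hasSyncedFreelist(m meta) bool {
	return m.freelist != noFreelistMarker
}

// currentSource models the existing logic: only the meta page decides.
func currentSource(m meta, _ options) source {
	if !hasSyncedFreelist(m) {
		return fromScan
	}
	return fromPage
}

// proposedSource models the proposal: NoFreelistSync also forces a scan.
func proposedSource(m meta, o options) source {
	if !hasSyncedFreelist(m) || o.NoFreelistSync {
		return fromScan
	}
	return fromPage
}

func main() {
	// The meta page points at a (possibly corrupted) freelist page.
	m := meta{freelist: 2}
	o := options{NoFreelistSync: true}
	fmt.Println("current: ", currentSource(m, o))
	fmt.Println("proposed:", proposedSource(m, o))
}
```

Under this model, a user whose freelist page is damaged could set NoFreelistSync to true and have the freelist rebuilt by a scan instead of read from the damaged page. Whether that is safe for real databases is exactly what is being discussed here.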
@gandarez is there any update on this? thx
I haven't heard anything from anybody. Is this issue still on track?
I haven't heard anything from anybody. Is this issue still on track?
Based on all the info we have so far, the db file is most likely corrupted somehow. The only suggestion I can think of for now is to back up the db file regularly (since your application is a standalone client). For distributed systems, single points of failure are usually tolerated.
It would be great if you could provide the db file the next time you run into a similar issue, so that I can double check. I can also try to fix the corrupted db file using the surgery commands.
BTW, how many times have you run into this kind of corruption in your application?
Since it runs as a standalone application, it's hard to say how many users were affected, but it seems only one is still hitting this issue. I tried to contact them but didn't get any reply.
I've been using bbolt for two years (already updated to the latest version, v1.3.7) and have started getting a weird panic when opening the database file. I can't debug it or get the db file to test with, since I distribute my application as a standalone client. Why does it panic instead of returning an error? Does the error happen because the db is corrupted?
https://github.com/etcd-io/bbolt/blob/da2f2a53f6e2f25b215b79db2cd417488ef8e955/freelist.go#L265
https://github.com/wakatime/wakatime-cli/issues/848
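As a caller-side stopgap while the library panics here, the open call can be wrapped so that a panic is turned into an ordinary error. The sketch below is stand-alone and does not import bbolt; openSafely and the opener closures are hypothetical names, and the panic message is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// openSafely runs open and converts any panic raised inside it into an error.
// It only catches panics on the calling goroutine.
func openSafely(open func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("open panicked: %v", r)
		}
	}()
	return open()
}

func main() {
	// Hypothetical stand-in for an open call that panics on a corrupted page.
	corruptOpen := func() error { panic("invalid freelist page") }

	fmt.Println(openSafely(corruptOpen))
	fmt.Println(openSafely(func() error { return errors.New("file not found") }))
	fmt.Println(openSafely(func() error { return nil }))
}
```

In a real client, the closure would call the actual open function and assign the returned handle to a variable in the enclosing scope. This does not repair anything. It only lets the application report the problem or fall back to a fresh db file instead of crashing.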