Closed rlofc closed 8 months ago
What exception do you see? If data inconsistency happens, I suspect either that the process has crashed or that there is a bug.
Fragmentation is inevitable when you repeatedly write/delete records. Then, calling rebuild is a good idea to reduce the file size. You can call ShouldBeRebuilt for that purpose.
Anyhow, inconsistency and fragmentation are different things. If inconsistency is the matter, we should examine it in detail. If fragmentation is the matter, just calling Rebuild occasionally will do the job.
It's a bit challenging to get the originating exception, but the result is that I get a corrupt file with the following segfault after calling Get:
0x00007ffff7727e3e in tkrzw::TreeDBMImpl::SearchTree (this=this@entry=0x5555559464d0, key=...,
leaf_node=leaf_node@entry=0x7fffff7ff1a0) at tkrzw_dbm_tree.cc:1562
1562 Status TreeDBMImpl::SearchTree(std::string_view key, std::shared_ptr<TreeLeafNode>* leaf_node) {
(gdb) bt
#0 0x00007ffff7727e3e in tkrzw::TreeDBMImpl::SearchTree (this=this@entry=0x5555559464d0, key=...,
leaf_node=leaf_node@entry=0x7fffff7ff1a0) at tkrzw_dbm_tree.cc:1562
#1 0x00007ffff772f867 in tkrzw::TreeDBMImpl::Process (this=0x5555559464d0, key="EME56GHC",
proc=proc@entry=0x7fffff7ff260, writable=writable@entry=false) at tkrzw_dbm_tree.cc:575
#2 0x00007ffff772fb95 in tkrzw::TreeDBM::Process (this=this@entry=0x55555595d568, key=...,
proc=proc@entry=0x7fffff7ff260, writable=writable@entry=false) at tkrzw_dbm_tree.cc:2557
#3 0x00005555556aba52 in tkrzw::DBM::Get (value=0x0, key=..., this=0x55555595d568) at /usr/include/tkrzw_dbm.h:1041
#4 EntityRepo::exists (this=this@entry=0x55555595d568, uuid="EME56GHC") at ../server/repo.cc:188
Hi,
You talk about HashDBM, but your stack trace is only about TreeDBM. Another point, though I'm not sure: the value=0x0 in frame #3 is quite suspect.
My 2 cents
@rafal98 here's what I get from inspect:
Inspection:
class=HashDBM
healthy=false
auto_restored=false
path=kEFvdTUb.tkh
cyclic_magic=162
pkg_major_version=1
pkg_minor_version=0
static_flags=33
offset_width=4
align_pow=10
closure_flags=0
num_buckets=131101
num_records=26143
eff_data_size=35163370
file_size=56363008
timestamp=1702725550.719930
db_type=0
max_file_size=4398046511104
record_base=528384
update_mode=in-place
record_crc_mode=none
record_comp_mode=zstd
Actual File Size: 56363008
Number of Records: 26143
Healthy: false
Should be Rebuilt: false
Okay, that's even weirder - the file is instantiated as tkrzw::TreeDBM - and I have an AsyncDBM in use too, but it shows as HashDBM in inspect.
Other than that issue, all works as intended.
EDIT - confirming that the file is most likely a TreeDBM (although reported by inspect as HashDBM). I just changed the code to open the file as HashDBM, and reading any record fails.
Closing this. I suspect the inconsistency issue I'm encountering is due to using AsyncDBM on top of TreeDBM without correct error checking (although I'm not sure what the best way to do that with AsyncDBM is - but that's for another issue). As for the crash, I'm 99.99% sure this is my fault. In any case, I think @estraier's answer to my original question in the title is sufficient.
I'm seeing a weird behavior that I can so far only associate with a long-running TreeDBM session with >100k write/delete ops of binary blobs. My file inflates, and at some point, I suspect, an exception from the library is thrown that leaves the data in an inconsistent state.
Is it recommended to preemptively Rebuild the data file during a long-running session? So far, doing so appears to prevent this from happening - but I'm not 100% sure.