Closed AlexeyAkhunov closed 3 years ago
Perhaps this was caused by the incompatible versions we used (the DB was created by a version from the devel branch, and the corruption was reported by code from the master branch). When I tried the same with the devel branch, I got no corruption on opening, but I could not write:
INFO [04-26|09:53:31.807] Opening Database (MDBX) mapSize="0 B"
panic: create bucket: clique-snapshots-, mdbx_dbi_open: operation not permitted
goroutine 1 [running]:
github.com/ledgerwatch/turbo-geth/turbo/node.RegisterEthService(...)
For the step above, the version of MDBX was #define MDBX_BUILD_SOURCERY bbcb0fd59ce19d2a8624d61216a54afc8ff2e62bc38e846a0ea2f96f3999bda5_v0_9_3_150_g31cfce4c
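Since each step in this thread is identified by its MDBX_BUILD_SOURCERY value, a small helper can make the versions easier to compare. The sketch below assumes the value has the shape of a hex digest followed by a git-describe-style suffix (tag, number of commits after the tag, abbreviated commit hash) with underscores as separators; that reading is inferred from the strings quoted here and is not taken from libmdbx documentation. The names sourcery and parseSourcery are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// sourcery holds the parts of an MDBX_BUILD_SOURCERY value.
// The field meanings are an assumption based on the git-describe convention.
type sourcery struct {
	digest          string // leading hex digest
	version         string // tag, e.g. "0.9.3"
	commitsSinceTag int    // number of commits after the tag
	commit          string // abbreviated commit hash (without the "g" prefix)
}

// Assumed layout: <hex>_v<major>_<minor>_<patch>_<count>_g<hex>
var sourceryRe = regexp.MustCompile(`^([0-9a-f]+)_v(\d+)_(\d+)_(\d+)_(\d+)_g([0-9a-f]+)$`)

// parseSourcery splits a MDBX_BUILD_SOURCERY value into its assumed parts.
func parseSourcery(s string) (sourcery, error) {
	m := sourceryRe.FindStringSubmatch(s)
	if m == nil {
		return sourcery{}, fmt.Errorf("unrecognized MDBX_BUILD_SOURCERY value: %q", s)
	}
	n, err := strconv.Atoi(m[5])
	if err != nil {
		return sourcery{}, err
	}
	return sourcery{
		digest:          m[1],
		version:         m[2] + "." + m[3] + "." + m[4],
		commitsSinceTag: n,
		commit:          m[6],
	}, nil
}

func main() {
	v, err := parseSourcery("bbcb0fd59ce19d2a8624d61216a54afc8ff2e62bc38e846a0ea2f96f3999bda5_v0_9_3_150_g31cfce4c")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("version %s, %d commits after tag, commit %s\n", v.version, v.commitsSinceTag, v.commit)
}
```

Comparing the commit field between two runs is a quick way to confirm whether the database was created and reopened by the same build.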
As I understand it, this is the same problem as the one described in https://github.com/erthink/libmdbx/issues/187. However, some differences in the details significantly change the picture. Please specify which version (i.e. the commit hash) you used for each run/step.
Nonetheless, the page mod-txnid (81721) > parent (81719) error cannot be due to incompatibility with previous versions of mdbx_chk.
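To illustrate what this error message expresses, here is a minimal sketch of such a consistency check. It assumes the usual copy-on-write B-tree property that a page cannot have been modified by a later transaction than the parent page that references it; the page type, the field names, and the parent page number are hypothetical and do not reflect libmdbx internals.

```go
package main

import "fmt"

// page is a hypothetical, simplified stand-in for a B-tree page header.
type page struct {
	pgno     uint64 // page number
	modTxnid uint64 // id of the transaction that last modified the page
}

// checkChildTxnid returns an error when the child page claims to have been
// modified by a later transaction than its parent. Under the assumed
// copy-on-write rule, changing a child also rewrites its parent in the same
// transaction, so the parent's txnid should never be smaller.
func checkChildTxnid(parent, child page) error {
	if child.modTxnid > parent.modTxnid {
		return fmt.Errorf("page #%d: mod-txnid (%d) > parent (%d)",
			child.pgno, child.modTxnid, parent.modTxnid)
	}
	return nil
}

func main() {
	parent := page{pgno: 1, modTxnid: 81719} // parent page number is made up
	child := page{pgno: 2326087, modTxnid: 81721}
	if err := checkChildTxnid(parent, child); err != nil {
		fmt.Println("badpage:", err)
	}
}
```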
As for mdbx_dbi_open: operation not permitted, this is a very unexpected error.
Are you sure that no other errors were returned before, that a read-write transaction is being used, or that the table being opened exists?
Are you sure that no other errors were returned before, that a read-write transaction is being used, or that the table being opened exists?
It tries to create the table (it shows as bucket in our logs) when it hits the operation not permitted error. I will debug to check whether the tx is in fact read-write.
Please specify which version (i.e. the commit hash) you used for each run/step.
I've updated my comments with versions of MDBX used
Thanks for the exact version information.
The devel branch (3e272d339a336657d0032369efbaa54b965348b6) is quite suitable for such a check, even without the experimental commit a8b6a30a2450747272dca2b7c49bd3ca9b9c607d.
In addition, the "Spilling with LRU policy" feature for https://github.com/erthink/libmdbx/issues/186 will be ready soon (probably today). So it would be reasonable to wait and check out the latest version.
operation not allowed error. Otherwise I suggest creating a separate issue for it.
The not permitted error seems to happen because the corruption was detected:
INFO [04-26|13:08:35.248] Build info git_branch=mdbx_debug git_commit=790ee7663f80d2b4c52e46bfccb8617b8a57c28a
INFO [04-26|13:08:35.248] Starting Turbo-Geth on Ethereum mainnet...
INFO [04-26|13:08:35.249] Maximum peer count ETH=50 total=50
INFO [04-26|13:08:35.249] Set global gas cap cap=25000000
INFO [04-26|13:08:35.288] Opening Database (MDBX) mapSize="0 B"
mdbx_set_readahead:8674 readahead OFF 0..273096532
badpage: corrupted branch-page #2326087, mod-txnid 81721
badpage: invalid page txnid (81721) for parent-page' txnid (81719)
panic: create bucket 2: snap, mdbx_dbi_open: operation not permitted
Should I abandon this DB file and redo the sync?
I have started the new sync with the version #define MDBX_BUILD_SOURCERY bbcb0fd59ce19d2a8624d61216a54afc8ff2e62bc38e846a0ea2f96f3999bda5_v0_9_3_150_g31cfce4c
If you think this should not contain the problem, please close the issue
The not permitted error seems to happen because the corruption was detected:
So there is a minor issue of losing the original error code. I think I'll fix this today.
Should I abandon this DB file and redo the sync?
Most likely, the error is only in the incorrect mod-txnid value(s) and there is actually no other damage in the database. So there are three options:
mdbx_chk, and then do something with this DB using a hacked version of the software (i.e. make a dump with mdbx_dump and then restore it with a non-hacked mdbx_load).
I have started the new sync with the version
#define MDBX_BUILD_SOURCERY bbcb0fd59ce19d2a8624d61216a54afc8ff2e62bc38e846a0ea2f96f3999bda5_v0_9_3_150_g31cfce4c
If you think this should not contain the problem, please close the issue
I suggest closing the issue if there are no problems with the current version.
Looks like it works now, closing
I have performed a full sync of a turbo-geth node with the new version of MDBX (after the ReadAhead issue was fixed). The sync started on 18 April 2021; I ran it until 21 April and then shut it down gracefully. In our logs, these were the messages at shutdown:
MDBX version used for the above is
#define MDBX_BUILD_SOURCERY ae952107e9ef79b10938353246db3c1ad3472520ce329559e432faf866794c89_v0_9_3_102_gb39cdcab
Then, I tried to start using the same database again, and it reported corruption:
MDBX version used for the above is
#define MDBX_BUILD_SOURCERY 8d84a8c94bf33a8a4b959516546f5a33458865e5e2d0f744710c43cf76e37cbd_v0_9_3_125_g2c717590
As @AskAlexSharov suggested, I turned on these extra debug things:
and tried again, and here is what I got:
For the step above, the version of MDBX was
#define MDBX_BUILD_SOURCERY 8d84a8c94bf33a8a4b959516546f5a33458865e5e2d0f744710c43cf76e37cbd_v0_9_3_125_g2c717590
Also, as @AskAlexSharov suggested, I ran mdbx_chk on the database, and this is the output I got:
For the step above, the version of MDBX was
#define MDBX_BUILD_SOURCERY 8d84a8c94bf33a8a4b959516546f5a33458865e5e2d0f744710c43cf76e37cbd_v0_9_3_125_g2c717590
I can keep the database file around for further investigations, if required
Thanks a lot in advance