Closed falcon027 closed 7 months ago
Do the errors above occur in that order (the JS error and then the abort())? Does this occur on startup or after some usage? Does this occur with v2.9?
Thanks for getting back to me, it does happen randomly after using it for a while. I have not tested it with v.2.9 yet, I will try and get back to you.
I have also published a v3.0.1 with increased debugging information for this error, if you can get a chance to try that.
I just tested it with version 3.0.1. And I get the following error message:
Freelist record not found 1 5 1 1 1 68 0 0../dependencies/lmdb/libraries/liblmdb/mdb.c:4330: Assertion 'mop_len == 0' failed in mdb_freelist_save()
zsh: abort NODE_ENV="production" node masterProcess.js
I put in an (attempted) fix for this in v3.0.2, let me know if it addresses the issue or if there is still an error what the message is (added more debugging).
Thanks for your quick debugging work. On version 3.0.2 I still get the following error:
Freelist record not found 1 5 1 1 1 8 2147483647 2147483647 0 0 0 0 0
../dependencies/lmdb/libraries/liblmdb/mdb.c:4341: Assertion 'mop_len == 0' failed in mdb_freelist_save()
zsh: abort NODE_ENV="production" node masterProcess.js
Ok, I have changed that assertion, as it appears this is a valid state to me. Let me know if you run into this or a related issue with v3.0.3.
This exact error no longer occurs, but there are new ones, see below:
Write error: Invalid argument position 163840, size 4096mdb_page_spill flushing error 22
mdb_page_spill error 22
and
Error: Invalid argument: Attempting to write page at position 163840, size 4096
at Function.<anonymous> (file:///Users/falcon/Projects/express-function/node_modules/.pnpm/lmdb@3.0.3/node_modules/lmdb/write.js:413:7) {
code: 22
}
I have published a v3.0.3-debug.1 with more debugging information.
However, do you have any example/test case that can reproduce this? I have never been able to reproduce any of these errors, with any of our tests, including with your db config above.
I have run your newly created debugging version and I get the following error:
Error: Invalid argument: Attempting to write page at position 163840, size 4096, blocks 1, buffer sizes 4096 4096 4096
at Function.<anonymous> (file:///Users/falcon/express-function/node_modules/.pnpm/lmdb@3.0.3-debug.1/node_modules/lmdb/write.js:413:7) {
code: 22
}
I will try to narrow down the problem and send you a setup to reproduce the error.
I was able to pinpoint the problem and create a repo to reproduce the bug. Please let me know if you can reproduce the bug on your system or if it might be platform specific.
The database file seems to be corrupted in some way.
Yes, I can reproduce this with your database. And indeed there is a incorrect/corrupted free space entry that is causing the problem. Do you have any suggestions/reproduction for how to get the database into this state? I haven't been able to reproduce that. I could add some code to tolerate this condition, but certainly would prefer to find the cause.
I recorded and replayed the database inserts prior to the curpution, but was unable to reproduce the condition outside of my application. I will keep trying and get back to you with the results, if you have any ideas let me know.
I was able to replicate the bug in an isolated environment. The repository has been updated. If you run the runMultiple.js file and wait for approximately 2 minutes, you will encounter the error. You may need to restart the program a few times to reproduce the bug. Please let me know if the bug is reproducible on your system.
No, I haven't been able to reproduce this. I have run this for multiple hours on my linux and mac computers with various restarts and haven't had any luck in reproducing. Will keep trying though.
That's not good, maybe it's something specific to my setup, I have an M1 Mac. See hardware info below:
Software:
System Software Overview:
System Version: macOS 14.4.1 (23E224)
Kernel Version: Darwin 23.4.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Secure Virtual Memory: Enabled
System Integrity Protection: Enabled
Time since boot: 18 minutes, 9 seconds
Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro18,3
Chip: Apple M1 Pro
Total Number of Cores: 10 (8 performance and 2 efficiency)
Memory: 32 GB
System Firmware Version: 10151.101.3
OS Loader Version: 10151.101.3
Activation Lock Status: Enabled
I also put the logs of a runMultible.js execution into the repo maybe you can see something from the logs.
Thank you for information. Perhaps this is an ARM-specific issue, I will try an AWS instance and see if I can reproduce next.
So far, no luck with a graviton, after a few hours of testing. Will see if I can get a mac instance started.
OK, this is strange, when I run the runMultible.js file the error appears within a minute. I also tried it with version 2.9.4 and got the following error:
Node.js v20.10.0
[error] file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:497
let error = new Error('Unexpected end of MessagePack data')
Error: Unexpected end of MessagePack data
at read (file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:497:18)
at Array.readObject [as read] (file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:522:18)
at read (file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:260:23)
at checkedRead (file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:197:13)
at Packr.unpack (file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:104:12)
at Packr.decode (file:///Users/falcon/Projects/express-function/node_modules/msgpackr/unpack.js:176:15)
at Object.next (file:///Users/falcon/Projects/express-function/node_modules/lmdb/read.js:552:31)
at Object.next (file:///Users/falcon/Projects/express-function/node_modules/lmdb/util/RangeIterable.js:32:35)
at next (file:///Users/falcon/Projects/express-function/node_modules/lmdb/util/RangeIterable.js:261:24) {
incomplete: true
}
Sorry this took a while, I realized I was being stupid and running the wrong script. Should be fixed in v3.0.6.
I tested it on my machine and it works perfectly. Thank you very much. ( I sent a small token of appreciation to your email.)
I have a problem that causes my application to crash. And I have no idea what it means or how to fix it. Does anyone have any idea how I can prevent this?
My config
Thanks in advance Falcon.