Closed maxlath closed 2 years ago
Is that an expected amount of RAM for leveldb with the default options?
No, but it depends on what you're doing. Do you have some code to share?
We have several modules making use of their own sub databases, see get_sub_db.js, which returns a subleveldown-wrapped db handle, to do their thing in different encodings, mostly json and utf8. There is also level-jobs, a lib that was not updated in the last 4 years, but disabling the code relying on it (thus not even loading the module) did not seem to reduce the amount of RAM used.
Have you ruled out cacheSize and maxOpenFiles?
yes, both options are disabled in the currently running process based on this branch
The rest of it is too much code to go over.
Any clue on what we can try to address this?
Use a memory profiler, starting with the JS side because that's easier.
yes, I tried to use heapdump, but struggled to get any useful information out of the result. Thanks for the input, I'll keep digging
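As a first check before a full heap dump, a minimal sketch using only Node built-ins to compare the V8 heap against the whole process footprint; a large gap between rss and heapUsed points at native (LevelDB/C++) allocations, which JS heap tools like heapdump cannot see:

```javascript
// Compare the V8 heap against the whole process footprint (Node built-ins only).
// A large gap between rss and heapUsed suggests native allocations that
// JS-side heap profilers will not show.
const mb = bytes => (bytes / 1024 / 1024).toFixed(1) + ' MB'
const { rss, heapTotal, heapUsed, external } = process.memoryUsage()
console.log({ rss: mb(rss), heapTotal: mb(heapTotal), heapUsed: mb(heapUsed), external: mb(external) })

// Node can also write a heap snapshot readable by Chrome DevTools (Node >= 11.13):
// require('v8').writeHeapSnapshot('./leader.heapsnapshot')
```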
I wonder if this might be related to those issues with mmap in LevelDB repo https://github.com/google/leveldb/issues/860 https://github.com/google/leveldb/issues/866: would there be a way to customize mmap limit from leveldown? (I guess not, as the answer in the issue is to fork leveldb :/)
Edit: possibly similar issue https://github.com/Level/leveldown/issues/758
would there be a way to customize mmap limit from leveldown?
Not via JS, but if you just want to confirm that mmap is the issue, you could clone the leveldown repo, tweak relevant LevelDB code (which is vendored into leveldown), build leveldown and see if it makes any difference. I think the relevant code is:
I tried setting mmap_limit = 0, but while it does successfully prevent files from being mmapped, the huge memory allocation is still there. So it seems I'm back to looking for clues.
Inspired by https://github.com/vectordotdev/vector/issues/7514, I went on to profile the memory allocation using valgrind (in my case, the command was valgrind --tool=massif node ./scripts/level_party_leader.js
). Here is the report data generated after a process holding between 2 and 4GB of RAM was shut down to let level_party_leader.js take the lead (a process that just joins the level-party and does nothing else). Its memory went through the roof (4GB) as it was, I guess, taking over the tasks started by the previous level-party leader(?). I then sent it a SIGINT to get it to generate the massif data report:
massif.out.538201.inventaire.prod.zip
Feeding that report to massif-visualizer gives the following visualization: [massif-visualizer screenshot]. If anyone has a clue what could possibly make those functions allocate so much memory, I'm interested :sweat_smile:
I'm delighted to announce that this issue seems solved :sweat_smile:
I went into the details of the issue in the commit message here https://github.com/inventaire/inventaire/commit/4141906b7ea47fcbb6bf374104648706949b4db8 but here is a summary of what might be of interest for level-party:
An error in our code was making us build batch operation arrays with a big array mistakenly passed as an operation key. Somewhere between the levelup batch function and the abstract-leveldown batch function, that array operation key was turned into a buffer, and the batch operations went on without throwing an error.
Is this an expected behavior to be able to pass an array as a batch operation key? If yes: is it possible to customize _serializeKey from a level-party db object, to make it reject an array passed as a key?
Minimal code to reproduce it:
```js
const db = require('level-party')('/tmp/foo')

db.batch([
  {
    type: 'del',
    // This key will be converted to 'foo,bar', no error will be raised
    // 'level' has the same behavior
    key: [ 'foo', 'bar' ]
  }
])
  .then(res => console.log('success', res))
  .catch(err => console.error('error', err))
```
Glad to hear you found the issue, congrats!
Is this an expected behavior to be able to pass an array as a batch operation key?
Sort of, yes. It depends on the encoding. In this case the key goes through the utf8 encoding, which just does String(key).
In general, encodings don't validate input, in order to be fast and forgiving. They massage user input into a desired type. For example, db.get(Buffer.from('3')), db.get(3) and db.get('3') all do the same thing. I'm not sure to what extent that's by design (historically) but it's fitting for JavaScript.
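Since the utf8 encoding boils down to String(key), the coercion can be reproduced with plain JavaScript, no db involved:

```javascript
// The utf8 encoding effectively does String(key), so these all
// produce the same key:
console.log(String('3')) // '3'
console.log(String(3)) // '3'
console.log(String(Buffer.from('3'))) // '3'

// And an array passed as a key is silently joined with commas:
console.log(String([ 'foo', 'bar' ])) // 'foo,bar'
```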
You could opt-in to stricter type checking by using TypeScript.
Is it possible to customize _serializeKey?
Encodings are applied before _serializeKey(), so at that point your array key has already been converted to a string. Side note: the _serializeKey method no longer exists in abstract-level.
env
problem
Our level-party leader process holds up to 3GB of RAM, and refuses to release it once the activity peak passes, before eventually being killed by the Linux OOM killer (see https://github.com/inventaire/inventaire/issues/542).
It's actually probably not a level-party-specific issue, as trying to run with only level produces the same behavior, but I didn't know where to post: would that be a leveldown issue? Is that an expected amount of RAM for leveldb with the default options? Any clue on what we can try to address this?
Thanks for any help, I'm running short of ideas ^^'