janekolszak opened 1 year ago
In my tests, I was able to get values up to 2gb. What error are you seeing?
I'm seeing:
```
<--- Last few GCs --->
[1104752:0x5672330] 6147 ms: Scavenge 299.4 (333.2) -> 299.4 (333.2) MB, 34.5 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure
[1104752:0x5672330] 7458 ms: Scavenge 491.4 (525.2) -> 491.4 (525.2) MB, 65.3 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure
[1104752:0x5672330] 10035 ms: Scavenge 875.4 (909.2) -> 875.4 (909.2) MB, 128.5 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure

<--- JS stacktrace --->

FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory
 1: 0xa04200 node::Abort() [node]
 2: 0x94e4e9 node::FatalError(char const*, char const*) [node]
 3: 0xb797be v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb79b37 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd343c5 [node]
 6: 0xd0cf05 [node]
 7: 0xe962ae [node]
 8: 0xe9b9f4 [node]
 9: 0xe9bcb8 [node]
10: 0xeef18b v8::internal::JSObject::AddDataElement(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes) [node]
11: 0xf43c92 v8::internal::Object::AddDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::StoreOrigin) [node]
12: 0xf46f8f v8::internal::Object::SetProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin, v8::Maybe<v8::internal::ShouldThrow>) [node]
13: 0x10709c5 v8::internal::Runtime::SetObjectProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin, v8::Maybe<v8::internal::ShouldThrow>) [node]
14: 0xdcfb6a v8::internal::Runtime_KeyedStoreIC_Slow(int, unsigned long*, v8::internal::Isolate*) [node]
15: 0x14011f9 [node]
```
Here is a small demo (Ubuntu 20, lmdb-js v2.7.3) that fails with a segfault. First create the test file:

```sh
head -c 333MB /dev/urandom > bigfile.bin
```
```ts
/* eslint-disable */
import { open } from 'lmdb';
import * as fs from 'fs';

async function main() {
  const db = open<any, string>({
    path: "./test-cache",
  });
  const big = fs.readFileSync("./bigfile.bin");
  await db.put("id", big.toString());
  console.log("put() success");
  let loaded = await db.getBinary("id");
  console.log("getBinary() success");
  loaded = await db.getBinaryFast("id");
  console.log("getBinaryFast() success");
  loaded = await db.get("id");
  console.log("get() success");
  await db.close();
  console.log("OK");
}

main().catch((e) => console.error(e));
```
If you create a smaller file and run it, it prints OK:
`head -c 1MB /dev/urandom > bigfile.bin`
I am kind of wondering if this is a V8 bug. I can actually trigger an error without lmdb at all by doing this with your code:
```js
const big = fs.readFileSync("./bigfile.bin");
let str = big.toString();
let d = new TextEncoder().encode(str);
let s = new TextDecoder().decode(d);
console.log(s.length);
```
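For what it's worth, the inflation is easy to reproduce at small scale: every invalid UTF-8 byte decodes to U+FFFD (the replacement character), and U+FFFD re-encodes as a 3-byte sequence, so a decode/encode round-trip of random bytes can roughly triple the payload. A tiny sketch (the `bad` buffer is a made-up stand-in for `bigfile.bin`):

```javascript
// Each invalid UTF-8 byte decodes to U+FFFD, which re-encodes as 3 bytes,
// so decode-then-encode can ~3x the size of random binary data.
const bad = Buffer.from([0xff, 0xff, 0xff, 0xff]); // 4 bytes of invalid UTF-8
const str = bad.toString("utf8");                  // "\uFFFD\uFFFD\uFFFD\uFFFD"
const reencoded = new TextEncoder().encode(str);
console.log(str.length, reencoded.length);         // 4 characters -> 12 bytes
```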
I think this error might be occurring in msgpackr's native decoder and might not be properly handled there; but even if it were, V8 doesn't seem to be capable of decoding this string.
Maybe there should be an option for `getRange()` to use something like `getBinary()`?
Yes, @janekolszak, that seems like a reasonable option to add. I will try to get that in the next release.
Maybe lmdb-js decodes values into strings?
There seems to be a limit of 512MB on string size on 32-bit systems, but I see it on my 64-bit system with node 19.3.0 (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length).
Your demo on node 18.12.1 fails with:
```
TypeError [ERR_ENCODING_INVALID_ENCODED_DATA]: The encoded data was not valid for encoding utf-8
    at new NodeError (node:internal/errors:393:5)
    at TextDecoder.decode (node:internal/encoding:433:15)
    at Object.<anonymous> (/home/jan/work/lmdb/tools/big-file.ts:7:29)
    at Module._compile (node:internal/modules/cjs/loader:1159:14)
    at Module.m._compile (/home/jan/.nvm/versions/node/v18.12.1/lib/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1213:10)
    at Object.require.extensions.<computed> [as .ts] (/home/jan/.nvm/versions/node/v18.12.1/lib/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1037:32)
    at Function.Module._load (node:internal/modules/cjs/loader:878:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12) {
  errno: 1,
  code: 'ERR_ENCODING_INVALID_ENCODED_DATA'
}
```
Your demo on node 19.3.0 fails with:
```
Error: Cannot create a string longer than 0x1fffffe8 characters
    at TextDecoder.decode (node:internal/encoding:428:16)
    at Object.<anonymous> (/home/jan/work/lmdb/tools/big-file.ts:7:29)
    at Module._compile (node:internal/modules/cjs/loader:1218:14)
    at Module.m._compile (/home/jan/.nvm/versions/node/v19.3.0/lib/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1272:10)
    at Object.require.extensions.<computed> [as .ts] (/home/jan/.nvm/versions/node/v19.3.0/lib/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1081:32)
    at Function.Module._load (node:internal/modules/cjs/loader:922:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:82:12)
    at phase4 (/home/jan/.nvm/versions/node/v19.3.0/lib/node_modules/ts-node/src/bin.ts:649:14) {
  code: 'ERR_STRING_TOO_LONG'
}
```
My demo on node 18.12.1 fails with:
```
put() success
getBinary() success
getBinaryFast() success
TypeError: Cannot read properties of undefined (reading '0')
    at readString (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:568:22)
    at read (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:454:12)
    at checkedRead (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:195:13)
    at Packr.unpack (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:102:12)
    at Packr.decode (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:174:15)
    at LMDBStore.get (/home/jan/work/lmdb/node_modules/lmdb/read.js:230:70)
    at main (/home/jan/work/lmdb/tools/big-file.ts:22:23)
```
My demo on node 19.3.0 fails with:
```
put() success
getBinary() success
getBinaryFast() success
[1] 51080 segmentation fault (core dumped) ts-node ./tools/big-file.ts
```
Your demo modified for Deno:
```
error: Uncaught (in promise) TypeError: Cannot allocate String: buffer exceeds maximum length.
    at async Object.readTextFile (deno:runtime/js/40_read_file.js:56:20)
```
> Maybe lmdb-js decodes values into strings?
lmdb-js (msgpackr) preserves the types of values, so if you encode a string, it will be decoded as a string. And you are explicitly converting your data to a string when it is stored/encoded, so lmdb-js decodes to a string to match what you requested/stored:

`await db.put("id", big.toString())`

So if you don't want it decoded as a string, don't store it as a string; store it as a buffer/binary data.
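For illustration, a minimal sketch of the buffer-based variant of the demo above (same hypothetical `./test-cache` and `bigfile.bin` paths; not a tested snippet):

```javascript
import { open } from "lmdb";
import * as fs from "fs";

const db = open({ path: "./test-cache" });
const big = fs.readFileSync("./bigfile.bin"); // keep it as a Buffer

// No .toString(): the value is stored as binary, so no UTF-8 encode/decode
// happens and V8's string-length cap never comes into play.
await db.put("id", big);
const loaded = db.getBinary("id"); // read back as a Buffer
console.log(loaded.length === big.length);
```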
> There seems to be a limit of 512MB for string size on 32-bit systems, but I see it on my 64-bit system with node 19.3.0

It doesn't seem that surprising that V8 would change this without MDN being updated yet (maybe they felt it was better to be consistent, so that there are no behavioral differences that can be observed/detected between architectures).
Hi! Is it expected that it's not possible to `get()` values bigger than 333MB? It is possible to fetch with `getBinary()`. Thank you!