Open CMCDragonkai opened 2 years ago
Note that node module resolution would load locally first before loading from anywhere else.
I'm not sure how this applies to non-node runtimes like in NativeScript. Recommend not doing anything here until we have understood the platform properly.
The usage of ArrayBuffer
would be a standardisation across other JS libraries and anything claiming to support ecmascript remember.
Note that all Node Buffer
is Uint8Array
which is ArrayBuffer
.
If something supports ArrayBuffer
, they would support Uint8Array
and Buffer
at the same time.
We are also using alot of Uint8Array
where Buffer
was normally used. But ArrayBuffer
is always useful from a "storage" POV. And returning Uint8Array
means you always returning something that can be ArrayBuffer
too either by using it directly or with bytes.buffer
.
Therefore we can be satisfied with supporting Uint8Array
as well. But maximum compatibility is going to be the ArrayBuffer
.
The type we use in js-db
is LevelDB<string | Buffer, Buffer>
. This means the values returned satisfy the interface of Buffer
, Uint8Array
and ArrayBuffer
all together.
As for string | Buffer
in src/DB.ts
, if it works just by changing Buffer
to ArrayBuffer
, then this would solve this issue, and one could pass Buffer
, ArrayBuffer
or Uint8Array
as the key.
The only issue would be the domainPath
that uses prefixer
. Does that work with just ArrayBuffer
, or would that need Uint8Array
?
No it doesn't because it uses Buffer.isBuffer
to check. It throws this error Error: key should be a string or a buffer
if it is not Buffer
.
However one could wrap it in our domainPath
function. We should be able to zero-copy it into a Node Buffer
.
The domainPath
can be resolved by this change:
function domainPath(levels: DBDomain, key: string | ArrayBuffer): string | Buffer {
// prefixer only takes string or Node Buffer
// convert key into key_ which wraps ArrayBuffer as Buffer
let key_: string | Buffer;
if (typeof key === 'string') {
key_ = key;
} else if (ArrayBuffer.isView(key)) {
key_ = Buffer.from(key.buffer, key.byteOffset, key.byteLength);
} else {
key_ = Buffer.from(key);
}
if (!levels.length) {
return key_;
}
let prefix = key_;
for (let i = levels.length - 1; i >= 0; i--) {
prefix = prefixer(levels[i], prefix);
}
return prefix;
}
However this doesn't fix everything. We also use Buffer
for this.crypto.key
and there are complications with when using worker manager.
When using worker manager, we want to copy the ArrayBuffer
, but if the actual instance is Node Buffer
, the slice
call creates a view, not a copy. So we have to distinguish by checking with Buffer.is
and ArrayBuffer.isView
and doing a set
or copy
or slice
call. Ensuring that we are in fact copying the ArrayBuffer
.
But yes this issue looks like it is totally possible.
Note that this would only present an ArrayBuffer
interface to the outside world. Internally Node Buffer
would still be used by the domainPath
. And of course LevelDB still returns Buffer
. So primarily this change would be about increasing the compatibility of DB as we progressively move towards more usage of Uint8Array
and ArrayBuffer
such as js-id
and webcrypto.
The TextDecoder
works on ArrayBuffer
and Uint8Array
.
[nix-shell:~/Projects/js-db]$ node
Welcome to Node.js v14.17.3.
Type ".help" for more information.
> u = new Uint8Array([0x61, 0x62, 0x63])
Uint8Array(3) [ 97, 98, 99 ]
> u.buffer
ArrayBuffer { [Uint8Contents]: <61 62 63>, byteLength: 3 }
> new TextDecoder().decode(u.buffer)
'abc'
> new TextDecoder().decode(u)
'abc'
Need to ensure that this doesn't result in a performance regression with respect to benchmarks.
https://github.com/Level/subleveldown/issues/111 - seems like we should be able to use Buffer
, further checking required.
This is somewhat more complicated now that we have alot of functions relying on the Buffer
interface.
Applications that use this library right now has to use the feross buffer package if their runtime environment doesn't have the Buffer
class.
Having an ArrayBuffer
compatible system should make this alot more compatible, however unless this package is made generic over the underlying DB, that is browser indexeddb or node leveldb, then doing this doesn't really change the portability alot. However it does enable the usage of our system to be bit easier, so that when downstream libraries like PK uses ArrayBuffer
, they don't need to convert it to a Buffer
just so that it can be work inside the DB.
Although the new abstract-level has support for Uint8Array, so that may be a better way to support it https://github.com/Level/browser-level/blob/98c13d4356fdb6ca5668ddd35859e6c9b326b5cb/UPGRADING.md#uint8array-support rather than directly supporting ArrayBuffer
.
We are no longer attached to abstract-level
anymore. We are directly using rocksdb
. As it forces the NAPI API, we might as well stick with buffer. Any need to use ArrayBuffer
should just use the feross buffer.
The js-db
will likely have native mobile bindings later, in which case even there I'd say we should continue using feross buffer. The ArrayBuffer
is just quite low level.
One way to quickly solve this issue is to just take ArrayBuffer
type, and if it is an array buffer, wrap it up as a Buffer
automatically. That way it avoids users having to wrap it themselves. And opens us up to directly using ArrayBuffer
in the future.
If we receive an ArrayBuffer
. All we need to do is Buffer.from(arrayBuffer)
. That's it. That will be zero-copy and just wrap the buffer.
If we use bufferWrap
as in Polykey, then we can also support Uint8Array
.
So let's just do that. Here's the bufferWrap
function.
/**
* Zero-copy wraps ArrayBuffer-like objects into Buffer
* This supports ArrayBuffer, TypedArrays and the NodeJS Buffer
*/
function bufferWrap(
array: BufferSource,
offset?: number,
length?: number,
): Buffer {
if (Buffer.isBuffer(array)) {
return array;
} else if (ArrayBuffer.isView(array)) {
return Buffer.from(
array.buffer,
offset ?? array.byteOffset,
length ?? array.byteLength,
);
} else {
return Buffer.from(array, offset, length);
}
}
This means when we are passing Ids into the DB, we no longer have to do const idBuffer = id.toBuffer();
.
@amydevs Since in js-db we are planning to continue using Buffer
, you can use it in js-ws
as well, and just wrap your ArrayBuffer
as a regular buffer.
Do this: https://github.com/MatrixAI/js-db/issues/3#issuecomment-1280289006 when you check instanceof ArrayBuffer
.
Use the bufferWrap
:
function bufferWrap(
array: BufferSource,
offset?: number,
length?: number,
): Buffer {
if (Buffer.isBuffer(array)) {
return array;
} else if (ArrayBuffer.isView(array)) {
return Buffer.from(
array.buffer,
offset ?? array.byteOffset,
length ?? array.byteLength,
);
} else {
return Buffer.from(array, offset, length);
}
}
Now the issue is that Buffer
is not a regular type in browsers. But we deal with this by introducing feross/buffer
as the polyfill.
However this should only be done in the application context. So either the application remembers to bundle up with Buffer
when using such libraries. OR... the package.json
could specify feross/buffer
as some sort of polyfill-dependency? Not sure, could use peerDependencies
.
Actually for js-ws
it makes more sense to always convert to Uint8Array
and not deal with node buffers.
So that means if it is an ArrayBuffer
, you can do new Uint8Array(ab)
. If it is node buffer... it is already a Uint8Array
.
Specification
The
ArrayBuffer
type is more generalised in the JS ecosystem. The DB right now focuses on taking Node Buffers.Node buffers can be converted to
ArrayBuffer
easily, andArrayBuffer
can be wrapped in Node buffers.According to this comment: https://github.com/Level/level-js/issues/34#issuecomment-397811192 it is possible to use ArrayBuffers directly in leveldb, it just not documented.
It says that the
asBuffer
has to be turned false.I'm not sure how this would work with
keyEncoding
andvalueEncoding
that we have set to'binary'
.Our
raw
is also specifying that we would getBuffer
, would that mean we instead say that we returnArrayBuffer
instead, and with keys it would bestring | ArrayBuffer
?Furthermore this would impact js-encryptedfs, so it's worthwhile to explore the implications of this.
The primary reason to do this would be cross-platform compatibility for mobile platforms that may not have the Node buffer.
The alternative would be use https://github.com/feross/buffer everywhere as a dependency so that way everything just uses the
buffer
library. This will mean that any usage ofimport { Buffer } from 'buffer';
will be resolved by feross/buffer first though, so one should beware of that.Note that all Node
Buffer
isUint8Array
which isArrayBuffer
.If something supports
ArrayBuffer
, they would supportUint8Array
andBuffer
at the same time.One benefit would be integration of js-id https://github.com/MatrixAI/js-id/issues/1 can be simplified since
Id
asUint8Array
can be stored directly in the DB without first wrapping them asBuffer
.Additional context
Tasks
ArrayBuffer
compatibility in leveldbArrayBuffer
in DB without losingBuffer
supportstring | Buffer
withstring | ArrayBuffer
for keys, but retainBuffer
on returned output.domainPath
with https://github.com/MatrixAI/js-db/issues/3#issuecomment-946416685this.crypto.key
to be assumed to beArrayBuffer
toArrayBuffer
to an explicit slice copy of all possible instances ofArrayBuffer
typefromArrayBuffer
is still useful since we returnBuffer
.serialize
anddeserialize
to useArrayBuffer
andTextEncoder
andTextDecoder
which can do theutf-8
encoding/decoding.deserialize
andserialize
might need to change.Uint8Array
andArrayBuffer
usage, and ensure that slice-copying is working.