andymatuschak / react-native-leveldown

Native bindings to LevelDB for React Native, exposed as leveldown-compatible interface
MIT License
22 stars 6 forks source link

strings are incorrectly null-terminated #6

Closed savv closed 3 years ago

savv commented 3 years ago

The following snippet truncates everything after 0:

          await db.put('key', Buffer.from([1, 2, 3, 0, 4, 5, 6]));
          const res = await db.get('key');

I created a demo of this, here: https://github.com/savv/react-native-leveldown/commit/0ffb1d43ec0305ecb10e4cc2fcc967eb41005cb9

Removing the 0 returns the whole buffer (incl. 4 5 6). This is an issue on android and ios.

image

andymatuschak commented 3 years ago

Yikes. Right. I suppose this is occurring because we encode/decode the buffers to strings to marshal them over the FFI to the native modules. Unfortunately, React Native doesn't support marshaling any binary-like types (Buffer, Uint8Array, etc): https://github.com/facebook/react-native/issues/1424

One workaround would be to encode Buffers to base64 for marshaling, but that's an awfully painful cost. Hm hm…

andymatuschak commented 3 years ago

Actually, I suppose that encoding/decoding a binary buffer to UTF-8 strings (what we're doing now) shouldn't be that much faster than encoding/decoding to base-64. They're both taking an unfortunate hit iterating and copying the bytes in the buffer; the main difference is that the latter will require some more bitshifting and will produce a larger output. They're both awful compared to just marshaling the buffer directly, and the difference in awfulness is probably not significant enough to matter.

@savv: I'm afraid I'm not storing binary data in my databases in my current project, so I'm unlikely to be able to prioritize fixing this soon. If you'd be interested in preparing a pull request which serializes buffers keys/values to base-64 instead of UTF-8, I'd be happy to advise, review, and merge. If you're not interested, I'll add a warning to the Readme. I'm surprised the abstract-leveldown tests don't cover this, actually! That's probably worth an issue too.

andymatuschak commented 3 years ago

(alternately, this could perhaps be resolved even better through a TurboModules implementation, as you suggested elsewhere, but that would of course be a much heavier lift!)

lachenmayer commented 3 years ago

I've similarly found that buffers in general do not work as expected, we encountered various issues a while back. We are so far working around that by just using string keys & values. This works for now, but I would definitely be keen to figure out what it would take to get a TurboModules-based implementation working. Have either of you ever tried it for anything?

savv commented 3 years ago

Hi @lachenmayer, I'm on it (but using JSI). I currently need to add ArrayBuffers to hermes / jsc (see https://github.com/facebook/hermes/pull/419).

savv commented 3 years ago

Hi, exciting news!

You can find the new library here: https://yarnpkg.com/package/react-native-leveldb and here: https://github.com/greentriangle/react-native-leveldb

It directly imports leveldb from github (=> latest version, unlike android!) and creates C++ bindings which can then be called synchronously from javascript.

I think it makes sense to have this as a separate library, since I was unsure if leveldown is the right API for dealing with binary data. That said, it would perhaps be beneficial to use react-native-leveldb from this library, as it'll make it faster due to using JSI. I'm happy to help from my side, if I need to add something.

lachenmayer commented 3 years ago

Hey @savv, just seeing this - super exciting, what a great early Christmas gift :) I'll definitely try swapping out implementations at some point in the next couple of weeks.

Just had a brief glance through the code, really cool that the same C++ code works on Android & iOS with so little platform-specific boilerplate! Am going to look into JSI more, have found very little documentation/info on it so far. Do you by any chance know what the most up-to-date resources are for the new RN architecture stuff?

Re. leveldown, I think it wouldn't be too hard to wrap react-native-leveldb in this library. I think it still makes sense for react-native-leveldown to exist, as exposing the leveldown API will allow seamless interop with a bunch of other libraries in the Node.js "level" ecosystem, eg. encodings & query engines. Binary keys & values (usually as Node.js Buffers) are a fully supported feature, which can be enabled using the bufferKeys and bufferValues flags to the abstract-leveldown constructor (see https://github.com/andymatuschak/react-native-leveldown/blob/master/index.ts#L132).

On the other hand it seems like abstract-leveldown assumes some Node.js-isms that don't work on React Native, like process.nextTick which I had to manually set to process.nextTick = setImmediate to make it work, so maybe it's not the best approach after all.

Either way, awesome work, thanks so much for this. Hope you have great holidays.

savv commented 3 years ago

Hi @lachenmayer , I'm excited to hear about your experience! I have done some amount of testing on it, but I expect that there will be rough edges to iron out. Please feel free to file issues, and I will do my best to resolve them.

resources are for the new RN architecture stuff

I haven't found a lot of resources, a lot of it was trial and error and reading the source. I'm happy to help with JSI where I can, I'll reach out to you directly via email. In addition to JSI, there's also TurboModules. I think the vision is that TurboModules will automate some parts of creating interfaces between JS & C++, but haven't delved into it.

I think it still makes sense for react-native-leveldown to exist

I think so too, and I'm happy to help make this happen.

andymatuschak commented 3 years ago

This is wonderful! I do like the leveldown API and ecosystem, so I'd love to make the plan of record that we'll turn this package into a wrapper for yours. I'm unlikely to get to that in the next month or two; pull requests are welcome.

andymatuschak commented 3 years ago

One key issue with abstract-leveldown: it mandates async callbacks. On a whim, I modified the benchmark to async/await each call, and this degraded performance by roughly 20%. Significant!

Writes wouldn't be this bad in practice: rather than doing individual puts, you'd use the batch API. But the iterator API is inescapably individual.

In my use case, this kind of performance hit wouldn't be enough to matter, and I'd generally want to smear this work across many ticks to avoid blocking interactions on the main JS thread. But it's an overhead worth noting!

craftzdog commented 3 years ago

In react-native-sqlite-2, it escapes null characters to avoid this issue like so:

function escapeBlob(data) {
  if (typeof data === 'string') {
    return data
      .replace(/\u0002/g, '\u0002\u0002')
      .replace(/\u0001/g, '\u0001\u0002')
      .replace(/\u0000/g, '\u0001\u0001')
  } else {
    return data
  }
}

function unescapeBlob(data) {
  if (typeof data === 'string') {
    return data
      .replace(/\u0001\u0001/g, '\u0000')
      .replace(/\u0001\u0002/g, '\u0001')
      .replace(/\u0002\u0002/g, '\u0002')
  } else {
    return data
  }
}
andymatuschak commented 3 years ago

Closing this in favor of #7, which we'll call the plan of record.