Level / abstract-level

Abstract class for a lexicographically sorted key-value database.
MIT License
128 stars 8 forks source link

Confused about encoding #35

Closed ccollie closed 2 years ago

ccollie commented 2 years ago

i'm resolving failing tests for cacache-level and have a question about encodings.

As I understand it, when I declare valueEncoding and keyEncoding in my constructor, abstract-level makes sure that the values passed to my implementation are properly converted in and out of storage.

According to test failures, it seems that this applies to set methods and not get. IOW, values sent to the database are encoded properly, but are not decoded for the consumer. Either that or the tests themselves do not account for encodings.

As a concrete example, test simple put() always fails, because I specify valueEncoding as buffer (which the underlying lib returns), but the test compares against a plain string value without decoding. In addition, the encoding options sent to _get are exactly as set in the constructor, so there's no work to be done as far as the implementer can tell.

Any thoughts ?

vweevers commented 2 years ago

This is happening because of https://github.com/ccollie/cacache-level/blob/7a7aed5758dcbc566f196c7e9ee4e04bd4a61df1/index.js#L434-L438:

super({
  encodings: {
    buffer: true,
    utf8: true,
    view: true
  }
}, opts)

This tells the AbstractLevel constructor that your private API accepts input that is a buffer, string or view, and that it returns appropriate output types based on private options. Meaning, your implementation takes over some of the encoding responsibilities. You're then expected to handle all of these cases:

db._get(key, { keyEncoding: 'utf8', valueEncoding: 'utf8' }, callback) // Should yield string
db._get(key, { keyEncoding: 'buffer', valueEncoding: 'buffer' }, callback) // Should yield buffer
db._get(key, { keyEncoding: 'view', valueEncoding: 'view' }, callback) // Should yield Uint8Array
db._get(key, { keyEncoding: 'utf8', valueEncoding: 'buffer' }, callback) // And more permutations

What you probably want instead, assuming that cacache works with buffers:

super({
  encodings: {
    buffer: true
  }
}, opts)

Then you're only expected to handle the following case (and abstract-level will handle encoding from/to buffers):

db._get(key, { keyEncoding: 'buffer', valueEncoding: 'buffer' }, callback) // Should yield buffer

The second issue is that you're currently always overriding the keyEncoding and valueEncoding options (here), which are instead meant to be set by consumers (including the test suite).

vweevers commented 2 years ago

Note, atm there's unfortunately no way to specify that your private API wants string keys but buffer values; the encodings option applies to both keys and values.

vweevers commented 2 years ago

Closing because I assume the question has been answered.