Closed vweevers closed 6 years ago
Should also allow empty range options (zero-length string or zero-length Buffer) in _setupIteratorOptions
.
As @dominictarr wrote in https://github.com/Level/encoding-down/issues/42:
[abstract-leveldown now] strips out nullish ranges, which is reasonable for leveldown, but was supported in levelup, and null, undefined and '' are significant values in some encodings (bytewise, charwise).
Empty range options should be OK for leveldown, because IIRC it ignores such options in the C++.
Problem: in encoding-down
, range options must be serialized after encoding them.
Problem: in encoding-down, range options must be serialized after encoding them.
No :) The order doesn't matter in encoding-down
, as its _serializeKey
is an identity function.
@ralphtheninja WDYT about:
_serializeKey
?_serializeValue
?Note that before this PR (specifically, dec6278), keys that would likely serialize to an empty string were rejected:
And this String(obj)
call was kinda wasteful.
I'm leaning towards a yes, because we gotta make sure we're not passing empty string keys to leveldown, which segfaults on empty slices, or at least it used to. @ralphtheninja is that still the case, or does the C++ guard against that nowadays?
is that still the case, or does the C++ guard against that nowadays?
Hmm, I would have to check.
Might be moot. I'm thinking if leveldown doesn't support empty keys, other implementations shouldn't either for consistency. So it's something that abstract-leveldown
should guard against.
var leveldown = require('.')
var db = leveldown('./db')
db.open(function (err) {
db.put('', 'VALUE', function (err) {
console.log('after put', err)
db.get('', function (err, value) {
console.log('after get', err, value.toString())
})
})
})
If I comment out _checkKey
in abstract-leveldown
I get:
after put undefined
after get null VALUE
@ralphtheninja huh, I did not expect that!
Ah, the segfaults were not about empty keys, but empty range options: https://github.com/Level/levelup/issues/112.
Still, we never supported empty keys, AFAICS. The check was introduced here: https://github.com/Level/abstract-leveldown/commit/4015c7b8707783a8497bd8336ee8bbaf16f0e85d, moved to prototype here: https://github.com/Level/abstract-leveldown/commit/da0ef1310ab0d43edbbb9bfa0f425a2c8afc4af9, then stayed the same until today.
We have three options, in pseudocode:
serialize(check(key))
- this doesn't fully guard against empty keys, might be acceptable thoughcheck(serialize(key))
- this can't block types that we want to be illegal across the boardcheck(serialize(check(key)))
- overkill?Oh, how I hate this.
I made up my mind. I want to keep the current logic - serialize(check(key))
- because serialization is meant to convert a key to a type supported by the underlying storage. That can be any type, we shouldn't care. We check user input, yes, but once we pass data to be serialized, it's out of our hands.
What matters more, is how the data comes out of the storage - i.e. it shouldn't be nullish because of preexisting significance in iterators and streams. We don't have a runtime guard against that atm, or a good mechanism in place to support such a guard. Adding more input checks certainly doesn't help, and we can't write abstract tests for it because we can't make assumptions about serialization or type support. So... that's something to attack later (or not).
For now, rejecting nullish as input should be enough of a signal to consumers and implementors. In addition, I'd like to keep rejecting empty keys, simply because that's how it's always been.
The only gap left is that before this PR, abstract-leveldown
rejected String(key) === ''
. This is a real edge case though, because all JS types for which String(key) === ''
are already rejected: strings, arrays and buffers. The only type that can pass through is an object with a custom toString()
that returns ''
.
Tested against leveldown#next
with one addition:
LevelDOWN.prototype._serializeKey =
LevelDOWN.prototype._serializeValue = function (datum) {
return Buffer.isBuffer(datum) ? datum : String(datum)
}
After this I think we should add next
branches to memdown
and level-js
.
After this I think we should add next branches to memdown and level-js.
And encoding-down
(where we should be able to remove _setupIteratorOptions
).
And encoding-down (where we should be able to remove _setupIteratorOptions).
Good. This will make @dominictarr happy!
Was nice to finally get this merged!
HAHAHAHA there we have it. A perfect gif. I think we should use it in the release notes!
Closes #120, closes #130, closes #230. Checklist adapted from https://github.com/Level/abstract-leveldown/issues/120#issuecomment-390171073:
abstract#_setupIteratorOptions
(no need for #222)abstract#_setupIteratorOptions
.TBD_serializeKey
to an identity function_serializeValue
to an identity functionReject nullish keys before calling(Already the case). The benefit is that_serializeKey
._serializeKey
can be used on range options with nullish support._serializeValue
Reject nullish and empty keys after calling_serializeKey
? TBDReject nullish values after calling_serializeValue
? TBD_serializeKey
(#130)These are very tricky changes, will need canary testing in at least
leveldown
.Externally (just mental notes):
leveldown#_serializeKey
and#_serializeValue
(same logic) to convert nullish to an empty string, Buffer as-is, everything else to a string.