Open pwnall opened 4 years ago
I suspect that the right thing to do when a Comparator returns an error is for LevelDB to stop whatever it's doing, return failure, and mark the database corrupted. We rely on tables being sorted, and corruption may break that fundamental invariant so, for example, MergingIterator
would have no good way to proceed. For extra credit, we could have some iterators that skip over corrupted keys, for use in recovery.
As far as extending the API, here's a sketch with bad names.
class LEVELDB_EXPORT Comparator {
public:
enum class Ordering {
kLess = -1,
kEqual = 0,
kGreater = 1,
kBadInput = 2,
}
virtual ~Comparator();
// Three-way comparison. Deprecated. Extend Compare2 instead.
virtual int Compare(const Slice& a, const Slice& b) const = 0;
// Key comparison.
//
// Returns Ordering::kBadInput if at least of the inputs is not a valid key.
// ...
virtual Ordering Compare2(Slice a, Slice b) const {
// Implements Compare2() in terms of Compare(). This implementation
// is used by legacy Comparators that don't override Compare2().
int legacy_ordering = Compare(a, b);
if (legacy_ordering == 0)
return Ordering::kEqual;
if (legacy_ordering < 0)
return Ordering::kLess;
return Ordering::kGreater;
}
// Override if Compare2() may return Ordering::kBadInput.
virtual bool IsValid(Slice a) const { return true; }
// TODO: Think about FindShortest*().
}
IsValid()
would be used by iterators that skip over corrupted keys.
I am a little leery of adding Compare2 since it is somewhat complicated. And I don't think we need it if we add IsValid. We could instead just give guidance to the user that leveldb will not call Comparator::Compare() if IsValid returns false.
Though if we are considering such validation, we should consider a more general validation approach. E.g., we add a Validator interface and allow an optional "Validator*" to be passed in Options. Validator could contain a method like: bool IsValidKey(Slice); but could at some point be extended with something like the following in the future that can validate values if we wish: bool IsValidEntry(Slice key, Slice value);
@ghemawat Thank you for looking into this!
Great point about validating values!
I don't have an opinion on that because values are opaque blobs for the database, and Chrome has layered its own corruption checks on top of all storage systems. I'm happy to do what you think is right here.
Having a separate validator is more elegant than my proposal. However, it does incur some overhead (worst case, the key would get de-serialized twice), unless the Validator/Comparator implementer build some non-trivial caching on top. Chrome's Comparator uses non-trivial amounts of CPU time (and therefore battery) during compaction, and I'd be grateful if we didn't add significant overhead there.
@pwnall I would definitely use this feature!
Like you, I'm also concerned about the overhead of validating before performing the actual Compare
, even though that would be a more intuitive API.
When I decode my keys, I perform many checks to avoid reading invalid memory, so being able to communicate from Compare
that the keys are invalid would be a good perf improvement (my Compare
implementation shows up in benchmarks).
I guess another thing to keep in mind is the idea that some keys are valid in the context of Compare
, but not valid as keys to be inserted in the database.
The database might contain:
"tablespace0#123"
"tablespace1#123"
"tablespace1#124"
"tablespace2#123"
and I might create iterators that go from "tablespace1!"
to "tablespace1$"
to cover all "tablespace#"
keys.
To make sure I don't accidentally introduce bad keys into the database, I wrapped the DB class in my own class that validates keys in Debug builds. Having a general Validator *
as @ghemawat proposed would free me from having to do that and make it possible to validate on Release builds as well, but the Validator would have to make the distinction between validations for comparison and validations for insertion.
The API for
leveldb::Comparator::Compare()
has no provision for failure. Failure is a reasonable response when one of the inputs is not a valid application key. Assuming no bugs in LevelDB, this can happen if on-disk keys are corrupted.Example: suppose the on-disk key is a pair of strings, encoded as varint(length of string 1) | string1 | string 2. An input of
0x05 0x01
is then invalid -- s1's length is 5, but the key only has one more byte.