google / leveldb

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
BSD 3-Clause "New" or "Revised" License
36.4k stars 7.8k forks source link

Comparator::Compare() has no way to signal corruption #749

Open pwnall opened 4 years ago

pwnall commented 4 years ago

The API for leveldb::Comparator::Compare() has no provision for failure. Failure is a reasonable response when one of the inputs is not a valid application key. Assuming no bugs in LevelDB, this can happen if on-disk keys are corrupted.

Example: suppose the on-disk key is a pair of strings, encoded as varint(length of string 1) | string1 | string 2. An input of 0x05 0x01 is then invalid -- s1's length is 5, but the key only has one more byte.

pwnall commented 4 years ago

I suspect that the right thing to do when a Comparator returns an error is for LevelDB to stop whatever it's doing, return failure, and mark the database corrupted. We rely on tables being sorted, and corruption may break that fundamental invariant so, for example, MergingIterator would have no good way to proceed. For extra credit, we could have some iterators that skip over corrupted keys, for use in recovery.

As far as extending the API, here's a sketch with bad names.

class LEVELDB_EXPORT Comparator {
 public:
  enum class Ordering {
    kLess = -1,
    kEqual = 0,
    kGreater = 1,
    kBadInput = 2,
  }

  virtual ~Comparator();

  // Three-way comparison. Deprecated. Extend Compare2 instead.
  virtual int Compare(const Slice& a, const Slice& b) const = 0;

  // Key comparison.
  //
  // Returns Ordering::kBadInput if at least of the inputs is not a valid key.
  // ...
  virtual Ordering Compare2(Slice a, Slice b) const {
    // Implements Compare2() in terms of Compare(). This implementation
    // is used by legacy Comparators that don't override Compare2().
    int legacy_ordering = Compare(a, b);
    if (legacy_ordering == 0)
      return Ordering::kEqual;
    if (legacy_ordering < 0)
      return Ordering::kLess;
    return Ordering::kGreater;
  }

  // Override if Compare2() may return Ordering::kBadInput.
  virtual bool IsValid(Slice a) const { return true; }

  // TODO: Think about FindShortest*().
}

IsValid() would be used by iterators that skip over corrupted keys.

ghemawat commented 4 years ago

I am a little leery of adding Compare2 since it is somewhat complicated. And I don't think we need it if we add IsValid. We could instead just give guidance to the user that leveldb will not call Comparator::Compare() if IsValid returns false.

Though if we are considering such validation, we should consider a more general validation approach. E.g., we add a Validator interface and allow an optional "Validator*" to be passed in Options. Validator could contain a method like: bool IsValidKey(Slice); but could at some point be extended with something like the following in the future that can validate values if we wish: bool IsValidEntry(Slice key, Slice value);

pwnall commented 4 years ago

@ghemawat Thank you for looking into this!

Great point about validating values!

I don't have an opinion on that because values are opaque blobs for the database, and Chrome has layered its own corruption checks on top of all storage systems. I'm happy to do what you think is right here.

Having a separate validator is more elegant than my proposal. However, it does incur some overhead (worst case, the key would get de-serialized twice), unless the Validator/Comparator implementer build some non-trivial caching on top. Chrome's Comparator uses non-trivial amounts of CPU time (and therefore battery) during compaction, and I'd be grateful if we didn't add significant overhead there.

felipecrv commented 4 years ago

@pwnall I would definitely use this feature!

Like you, I'm also concerned about the overhead of validating before performing the actual Compare, even though that would be a more intuitive API.

When I decode my keys, I perform many checks to avoid reading invalid memory, so being able to communicate from Compare that the keys are invalid would be a good perf improvement (my Compare implementation shows up in benchmarks).

I guess another thing to keep in mind is the idea that some keys are valid in the context of Compare, but not valid as keys to be inserted in the database.

The database might contain:

"tablespace0#123"
"tablespace1#123"
"tablespace1#124"
"tablespace2#123"

and I might create iterators that go from "tablespace1!" to "tablespace1$" to cover all "tablespace#" keys.

To make sure I don't accidentally introduce bad keys into the database, I wrapped the DB class in my own class that validates keys in Debug builds. Having a general Validator * as @ghemawat proposed would free me from having to do that and make it possible to validate on Release builds as well, but the Validator would have to make the distinction between validations for comparison and validations for insertion.