Level / abstract-level

Abstract class for a lexicographically sorted key-value database.
MIT License
128 stars 8 forks source link

Refactor `keys()` and `values()` internals #13

Closed vweevers closed 2 years ago

vweevers commented 2 years ago

Follow-up for #12, to be tackled in a next major version. As written there:

Adds a lot of new code, with unfortunately some duplicate code because I wanted to avoid mixins and other forms of complexity, which means key and value iterators use classes that are separate from preexisting iterators. For example, a method like _seek() must now be implemented on three classes: AbstractIterator, AbstractKeyIterator and AbstractValueIterator. This (small?) problem extends to implementations and their subclasses, if they choose to override key and value iterators to improve performance.

To come up with a more DRY approach, it may help to first reduce the differences between the 3 iterators. Mainly: change the callback signature of AbstractIterator#next() from (err, key, value) to (err, entry). Perhaps (dare I say) remove callbacks altogether.

If we can then merge the 3 classes into one, or at least have a shared and reusable base class, then unit tests can probably be simplified too, not having to repeat like so:

https://github.com/Level/abstract-level/blob/7113ad100d5c65fea0d960e6a161cbc49524de2d/test/async-iterator-test.js#L23-L37

Lastly (unrelated but I postponed it because of the next() callback signature and to avoid more breaking changes) perhaps rename iterator() to entries().

vweevers commented 2 years ago

Although https://github.com/Level/abstract-level/pull/50 aligned the function signatures between the three types of iterators, I don't see a clean way to (further) merge them. The thing is, if we don't separate behavior by classes, then the separation must be made in individual methods, which would be worse. For example, to decode data, we currently do (simplified and in pseudocode):

class CommonIterator {
  async next () {
    const item = await this._next()
    return this.decodeItem(item)
  }
}

class AbstractIterator extends CommonIterator {
  decodeItem (entry) {
    entry[0] = this.keyEncoding.decode(entry[0])
    entry[1] = this.valueEncoding.decode(entry[1])
    return entry
  }
}

class AbstractKeyIterator extends CommonIterator {
  decodeItem (key) {
    return this.keyEncoding.decode(key)
  }
}

If we instead had a single base class, let's say AbstractIterator, then sure, implementations would not have to implement common methods like _seek() three times over:

class ExampleIterator extends AbstractIterator {
  _seek () {
    // Implement once here
  }

  async _next () {
    return ['key', 'value']
  }
}

class ExampleKeyIterator extends ExampleIterator {
  // Override ExampleIterator._next()
  async _next () {
    return 'key'
  }
}

But the downside is: we'd then need to add options to the AbstractIterator constructor telling it what type of iterator it is. And decode accordingly (etc - because decoding was just one example).

Long story short, the current code (after #50) is fine.