mafintosh / hyperdb

Distributed scalable database
MIT License
752 stars 75 forks source link

Question: Iterator API #105

Open andrewosh opened 6 years ago

andrewosh commented 6 years ago

As of now the iterator API is db.iterator(prefix, opts), and one of thse opts is a boolean gte, which is fine (and extensive) when results are hash-sorted.

With lex-sorted results it makes sense to have lt and lte options, at which point it's odd if those options are strings but gte is a boolean.

I know this is currently an undocumented API that's just used to power db.list, but moving forward would it make sense to stay consistent with Level and change the API to db.iterator(opts), where gt, gte, lt and lte are all string options?

andrewosh commented 6 years ago

I think this suggestion actually mixes two fundamentally different forms of iteration. Instead, it makes more sense to expose two iterators:

  1. A lexicographic iterator (LexIterator) that isn't depth-aware (i.e. you could use it with a very large, flat set of keys ['00000' - 'ffffff']), and supports all the standard Level options.
  2. The current prefix iterator (PrefixIterator), with its existing options (such as recursive, which only makes sense for paths).

The prefix iterator is useful for quick filesystem-like operations on path-like keys, but it's less general than the lexicographic iterator, which is necessary for creating a leveldown. It's worth noting that one could create a prefix iterator out of a lex iterator + correctly encoded keys.

That sound like a better approach? As a proof-of-concept, I've thrown this together in a branch that currently powers a (mostly) functional leveldown.

andrewosh commented 6 years ago

Quick clarification: Creating the LexIterator in the first place requires that the DB be created in lex mode, where keys aren't hashed during path creation. This is discussed more in this issue and implemented in this PR.