haskell / containers

Assorted concrete container types
https://hackage.haskell.org/package/containers
316 stars 178 forks source link

Visiting tree nodes within a given key range ? #708

Closed complyue closed 4 years ago

complyue commented 4 years ago

I'm not sure why Data.Map doesn't have a key range based visiting API, I figured out I can do it this way:

indexKeyRange
  :: IndexKey -> IndexKey -> Map IndexKey Object -> [(IndexKey, Object)]
indexKeyRange !minKey !maxKey =
  toList
    . takeWhileAntitone (<= maxKey)
    . dropWhileAntitone (< minKey)

But wouldn't it save the computation needed to re-balance the intermediate tree generated ? Or that re-balancing can be optimized out actually ?

I am creating an in-memory graph database, using Data.Map.Strict.Map as business object indices with specified object attributes. The typical scenario will be querying a small number of entries by key range, out of possibly all business objects of a certain class globally, so the implementation above would work, but not reasonable by far as it seems.

I think a lazy list returned by mere node visiting (i.e. no new node creation) would satisfy my needs, or I missed something ?

complyue commented 4 years ago

I end up with a solution slightly modified from https://mail.haskell.org/pipermail/haskell-cafe/2020-March/132004.html

module DB.Storage.InMem.TreeIdx
  ( foldRange
  )
where

import           Prelude

import           Data.Maybe

import           Data.Map.Internal

-- Kudos: Olaf Klinke olf at aatal-apotheke.de
--   https://mail.haskell.org/pipermail/haskell-cafe/2020-March/132004.html

-- name clash with Control.Monad
when :: (Monoid b) => Bool -> b -> b
when t b = if t then b else mempty

contains :: Ord k => (Maybe k, Maybe k) -> k -> Bool
contains (lbound, ubound) k =
  (isNothing lbound || fromJust lbound <= k)
    && (isNothing ubound || k <= fromJust ubound)

foldRange
  :: (Monoid b, Ord k) => ((k, a) -> b) -> (Maybe k, Maybe k) -> Map k a -> b
foldRange f range@(lbound, ubound) m = case m of
  Tip                    -> mempty
  (Bin _ k a left right) -> foldLeft <> this <> foldRight   where
    foldLeft =
      when (isNothing lbound || fromJust lbound < k) (foldRange f range left)
    this = when (range `contains` k) (f (k, a))
    foldRight =
      when (isNothing ubound || k < fromJust ubound) (foldRange f range right)
import qualified DB.Storage.InMem.TreeIdx      as TI

indexKeyRange
  :: IndexSpec
  -> TreeMap.Map IndexKey a
  -> Maybe [Maybe IdxKeyVal]
  -> Maybe [Maybe IdxKeyVal]
  -> [(IndexKey, a)]
indexKeyRange spec tree !minKeyVals !maxKeyVals = TI.foldRange
  (: [])
  (IndexKey spec <$> minKeyVals, IndexKey spec <$> maxKeyVals)
  tree

closing for now.