msakai / data-interval

Interval datatype, interval arithmetic, and interval-based containers for Haskell
Other
21 stars 9 forks source link

lookup value and the/a piece-wise constant component around it #36

Open tilowiklundSensmetry opened 3 years ago

tilowiklundSensmetry commented 3 years ago

I am sampling sections of a piece-wise constant function encoded as an IntervalMap, and for efficiency reasons it would be useful to have a lookup function that also reported the maximal (or just some non-trivial) interval around the query point on which the IntervalMap is constant. There is a quickly cooked up prototype here: https://github.com/tilowiklundSensmetry/data-interval/pull/1/commits/8e0d144a1c1eb76122a902a443100e8b9d84845e that I am guessing would give a (not necessarily maximal) such interval, but I am not certain enough about the semantics of IntervalMap to say whether it is correct.

Bodigrim commented 3 years ago

It's hard to specify the semantics of this function without refering to internal implementation details. E. g., the following function satisfies properties described in your haddock:

lookupAround :: Ord k => k -> IntervalMap k a -> Maybe (Interval k, a)
lookupAround k m = case lookup k m of 
  Nothing -> Nothing 
  Just a -> Just (Interval.singleton k, a)
tilowiklundSensmetry commented 3 years ago

I don't think the ideal semantics are difficult to specify (the maximal interval containing the point on which the function is constant). For performance reasons this might be a bit tricky with the current implementation. Adding something like the following might be sufficient:

If m was created by at most n insertions to an empty map, then the cardinality of Set.fromList [ fst (lookAround x1 m), ..., fst (lookAround xk m) ] is O(n).

Bodigrim commented 3 years ago

I think returning the maximal interval is a better semantics, even if implementation is not straightforward.

msakai commented 2 years ago

It is documented that IntervalMap keeps adjacent mappings unmerged even though they ​are connected and mapped to the same value, and toList already returns the result that depends on such representation.

https://github.com/msakai/data-interval/blob/6b62faf1dc4de5a8cad241a10eedf90fa4c2b825/src/Data/IntervalMap/Base.hs#L121-L126

Therefore I feel it is okay for lookupAround to return the result that depends on such representation.

Also, if we implement a function that merges such adjacent mappings (which requires Eq constraint to the value type), returning the maximal interval can be achieved by simply combining the function and lookupAround.