Closed MorrowM closed 1 year ago
I imagine that instead of a new rule, one can just express `lookup` via `foldr` and benefit from existing rewrite rules:

```haskell
lookup a = foldr (\(a', b') acc -> if a == a' then Just b' else acc) Nothing
```

Could someone check Core please?
> I imagine that instead of a new rule, one can just express `lookup` via `foldr` and benefit from existing rewrite rules.
I believe this isn't usually done in `base` since a user compiling with `-O0` will end up with a slower program. For example, `main = print $ lookup 100_000_000 (zip [0 ..] [0 ..])` runs about 33% slower on my machine if I use your `lookup`. One other option is to have a rewrite rule that rewrites the recursive `lookup` into a `foldr` implementation, but that seems less common, looking at existing implementations for things like `GHC.List.and` and `GHC.List.elem`.
I've confirmed that the RULE

```haskell
{-# RULES
"lookup/foldr" forall x.
  lookup x =
    foldr (\(k, v) r -> if x == k then Just v else r) Nothing
 #-}
```

produces the nice Core for the linked example.
This is still O(n) presumably? If so could someone elaborate the point of this? Why bother to pick up a constant factor improvement in something that is asymptotically suboptimal?
> I believe this isn't usually done in `base` since a user compiling with `-O0` will end up with a slower program.
I do not recall performance of `base` under `-O0` ever being much discussed, and I don't think it's relevant as long as the asymptotics do not change.
> Why bother to pick up a constant factor improvement in something that is asymptotically suboptimal?
It's a constant factor in time complexity, but potentially an asymptotic improvement for space complexity, from O(n) to O(1), if everything fused away.
Not space complexity, but allocation. That mostly matters for concurrent programming, but it matters a lot. Whether the operation is asymptotically good depends on how it's used. Remember that the fusion rule will generally only fire if the list is used only once. It likely represents checking for the result of some iterative process, which might be exactly what's needed.
> This is still O(n) presumably? If so could someone elaborate the point of this? Why bother to pick up a constant factor improvement in something that is asymptotically suboptimal?
If your association list is being produced by a good producer that uses `build` and doesn't need to be reused, then that's exactly when you'd want to use a fusion-friendly `lookup`. It's better than using a `Map` if you don't need repeated lookups.
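To illustrate the one-shot pattern described above (my own sketch; the producer and numbers are invented, not from the thread): a single `lookup` into a freshly zipped list is exactly the case where building a `Map` first would be wasted work, and where a fusion-friendly `lookup` could avoid allocating the pairs at all.

```haskell
-- `zip` over enumerations is a good producer, so with a fusion-friendly
-- `lookup` the intermediate list of pairs can fuse away entirely.
singleLookup :: Int -> Maybe Int
singleLookup k = lookup k (zip [0 .. 1000] (map (* 2) [0 .. 1000]))

main :: IO ()
main = print (singleLookup 7)  -- Just 14
```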
> I do not recall much discussion on performance of `base` under `-O0` virtually ever, and I don't think it's relevant as long as asymptotics does not change.
I'm simply going based off of prior art. Many functions in `GHC.List` and `GHC.Base` are written such that they use recursion and get rewritten to fusion-friendly versions. I'm not sure if I read the bit about `-O0` somewhere or if that's a justification I came up with some time ago, but I don't think it'd hurt to give `-O0` users a better dev experience. (I would be interested to hear if this is indeed the reason for all the rewrite rules in `GHC.List` if anyone happens to know; they seem to predate the GHC GitLab history.)
Though I still struggle to wrap my head around fusion on the best of days, this seems like a straightforward improvement. I prefer the `foldr` rule to the one that explicitly mentions `build`.
> I imagine that instead of a new rule, one can just express `lookup` via `foldr` and benefit from existing rewrite rules.
>
> ```haskell
> lookup a = foldr (\(a', b') acc -> if a == a' then Just b' else acc) Nothing
> ```
>
> Could someone check Core please?
I would like to reiterate @Bodigrim's question here. Can we just change the implementation of `lookup` to use `foldr`?
I find the fusion framework unreliable in general, so unless there's a strong reason, I'd go for changing the implementation explicitly.
> I find the fusion framework unreliable in general, so unless there's a strong reason, I'd go for changing the implementation explicitly.
Switching to a `foldr` can only help if fusion happens, and may hurt in some ways if it doesn't get inlined/doesn't get inlined soon enough. Adding a `RULE` seems to me to be reasonable (this is `base`, after all, so shouldn't we strive to make it as efficient as possible?). In any case, the reliability of fusion shouldn't make a difference, since this only helps if fusion happens and can't hurt if it doesn't (it is a fusion rule; this is what fusion is).
Even if we do want to stick to using `foldr` over using `RULES`, then we should do it for all the other functions that do this, no?
Speaking of, I do want to ask @treeowl: do you know why so many functions in `GHC.List` do this (e.g. `and`, `or`, `elem`, `any`)? I saw you in the history of this GHC wiki page, which is why I ask.
Rules are one of the most brittle features of GHC. It takes lots of effort (and trial and error) to maintain the current system, and each new rule makes me shiver. Simply rewriting `lookup` via `foldr` and automatically benefiting from all existing and future rules involving `foldr` is a much more appealing option to me.
I don't think I ever saw people seriously considering performance under `-O0`. It might be worth attention if it were asymptotically worse, or something like 10x worse, but 33% slower evaluation in GHCi is next to nothing in my books. The current proliferation of manual recursion in `Data.List` is most likely a historical accident.
> is most likely a historical accident.
The implementation is specified by the report: https://www.haskell.org/onlinereport/haskell2010/haskellch9.html
@Bodigrim Do you intend to remove the rewrite rules from `and`, `any`, `elem`, etc.?
To be clear, I'm fine with just writing it as `foldr`, I just think it would be best to not make `base` even more inconsistent than it already is.
> @Bodigrim Do you intend to remove the rewrite rules from `and`, `any`, `elem`, etc.?
It's up to you as a proposer to decide what you wish to do. My personal take is that we teach new joiners to avoid manual recursion, and it would be nice to showcase this virtue in `base`.
FWIW `GHC.List.{and,any,elem}` are of little significance: they are not re-exported from `Data.List` or `Prelude`, which prefer members of `Foldable` instead. Still, it would be nice to rectify them.
I recently worked on a package for infinite lists and was able to express pretty much everything via `foldr`, keeping fusion at the same scale as in `base`: https://github.com/Bodigrim/infinite-list/blob/master/src/Data/List/Infinite.hs
Alright then. The final proposal is to replace the current, manually recursive implementation of `lookup` with

```haskell
lookup a = foldr (\(a', b') acc -> if a == a' then Just b' else acc) Nothing
```
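As a sanity check (my own sketch, not part of the proposal; `lookupRec` and `lookupFoldr` are illustrative local names), the `foldr`-based definition agrees with the recursive one, including on duplicate keys, where both return the first match:

```haskell
-- The current, manually recursive definition.
lookupRec :: Eq a => a -> [(a, b)] -> Maybe b
lookupRec _key [] = Nothing
lookupRec key ((x, y) : xys)
  | key == x = Just y
  | otherwise = lookupRec key xys

-- The proposed foldr-based definition.
lookupFoldr :: Eq a => a -> [(a, b)] -> Maybe b
lookupFoldr a = foldr (\(a', b') acc -> if a == a' then Just b' else acc) Nothing

main :: IO ()
main = do
  let xs = [(1, "a"), (2, "b"), (1, "c")] :: [(Int, String)]
  -- Both return the first match for a duplicated key.
  print (lookupRec 1 xs == lookupFoldr 1 xs)  -- True
  print (lookupFoldr 1 xs)                    -- Just "a"
  print (lookupRec 3 xs == lookupFoldr 3 xs)  -- True (both Nothing)
```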
@Bodigrim I'd like to trigger a vote.
@MorrowM could you please prepare a GHC MR before a vote?
> Final proposal is to replace the current, manually recursive implementation of `lookup` with `lookup a = foldr (\(a', b') acc -> if a == a' then Just b' else acc) Nothing`
Can I urge that this code is accompanied with a Note explaining the thinking behind it? There has been quite a lot of discussion on this thread -- the Note could distil the learnings from those insights.
The danger is ending up with a cryptic one-liner that our future selves change back in ten years' time.
Perhaps we can simply comment out the existing implementation without entirely deleting it, as a sanity check for those unlucky souls still changing `Data.List` in ten years? It is defined in the Report after all.
There are CPP pragmas for using the report definition elsewhere, so this can be used here too.
Hmm, this makes me more hesitant to simply go with the "implement it as `foldr`" route. If we look at the full source for `GHC.List.and` we see

```haskell
and :: [Bool] -> Bool
#if defined(USE_REPORT_PRELUDE)
and = foldr (&&) True
#else
and [] = True
and (x:xs) = x && and xs
{-# NOINLINE [1] and #-}
{-# RULES
"and/build" forall (g::forall b.(Bool->b->b)->b->b) .
  and (build g) = g (&&) True
 #-}
#endif
```

It seems `base` goes out of its way to explicitly not use `foldr` as the standard implementation, even if it means using CPP and a RULE.
The current consensus in this thread seems to be to do

```haskell
lookup :: (Eq a) => a -> [(a,b)] -> Maybe b
#if defined(USE_REPORT_PRELUDE)
lookup _key [] = Nothing
lookup key ((x,y):xys)
  | key == x = Just y
  | otherwise = lookup key xys
#else
lookup key = foldr (\(x,y) acc -> if key == x then Just y else acc) Nothing
#endif
```

which is the exact opposite of what `and` does! This makes me a bit hesitant.
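For what it's worth, the two spellings of `and` shown above are extensionally equal, including their short-circuiting behaviour; a quick self-contained check (local names used to avoid clashing with the Prelude):

```haskell
-- Report-style definition, via foldr.
andReport :: [Bool] -> Bool
andReport = foldr (&&) True

-- Manually recursive definition, as in GHC.List.
andRec :: [Bool] -> Bool
andRec [] = True
andRec (x : xs) = x && andRec xs

main :: IO ()
main = do
  print (andReport [True, True] == andRec [True, True])    -- True
  print (andReport [True, False] == andRec [True, False])  -- True
  -- Both short-circuit, so an infinite list with an early False terminates.
  print (andRec (False : repeat True))                     -- False
```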
> It seems `base` goes out of its way to explicitly not use `foldr` as the standard implementation even if it means using CPP and a RULE.
This part of `base` is extremely old and its first iterations predate the Haskell 98 standard. I suspect that the historical process was vice versa: the earliest definition of `and` was derived from the HBC stdlib:

```haskell
module P_List_and where
and :: [Bool] -> Bool
and [] = True
and (x:xs) = x && and xs
```

Later, once Haskell 98 was out, a new, report-compliant definition was added, but never enabled by default.
Bottom line is that I'm not very sympathetic to the argument "our ancestors did it this way, let's cargo cult it".
> It is defined in the Report after all.
Strictly speaking, it's only the Haskell 98 Report which defines the stdlib implementation. IIANM the Haskell 2010 Report describes semantics, but does not prescribe any specific implementation.
Alright then, that seems to put any fears to rest. Should the `#if defined(USE_REPORT_PRELUDE)` part of the `lookup` definition be included? Or is that unnecessary? I do intend to put in a separate `Note` explaining this whole ordeal either way.
I personally do not see much point in `#if defined(USE_REPORT_PRELUDE)`; we are too far past the Report by now.
@Bodigrim GHC MR is up: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10715
> I imagine that instead of a new rule, one can just express `lookup` via `foldr` and benefit from existing rewrite rules.
>
> ```haskell
> lookup a = foldr (\(a', b') acc -> if a == a' then Just b' else acc) Nothing
> ```
>
> Could someone check Core please?
All results with optimisations enabled.
```haskell
lookup_old :: (Eq a) => a -> [(a,b)] -> Maybe b
lookup_old _key [] = Nothing
lookup_old key ((x,y):xys)
  | key == x = Just y
  | otherwise = lookup_old key xys
{-# OPAQUE lookup_old #-}

lookup_new :: Eq a => a -> [(a,b)] -> Maybe b
lookup_new key = foldr (\(x,y) acc -> if key == x then Just y else acc) Nothing
{-# OPAQUE lookup_new #-}

main = print $ lookup_old 10_000_000 (zip [0 ..] [0 ..])
-- main = print $ lookup_new 10_000_000 (zip [0 ..] [0 ..])
```
Perf (esp. allocs) seems to be the same:
```
Just 10000000
   1,680,051,448 bytes allocated in the heap
          61,600 bytes copied during GC
          44,328 bytes maximum residency (2 sample(s))
          29,400 bytes maximum slop
               5 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0       405 colls,     0 par    0.003s   0.003s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.394s  (  0.520s elapsed)
  GC      time    0.003s  (  0.003s elapsed)
  EXIT    time    0.000s  (  0.007s elapsed)
  Total   time    0.397s  (  0.530s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    4,264,646,419 bytes per MUT second

  Productivity  99.2% of total user, 98.0% of total elapsed
```
The Core is a bit different, but only because `foldr` introduces a join point for the loop (which has no overhead whatsoever):
```
lookup_new
  = \ (@a_a16M)
      (@b_a16N)
      ($dEq_a16O :: Eq a_a16M)
      (key_aHg [OS=OneShot] :: a_a16M)
      (eta_B0 [OS=OneShot] :: [(a_a16M, b_a16N)]) ->
      joinrec {
        go1_a2fZ [Occ=LoopBreaker, Dmd=SCS(L)]
          :: [(a_a16M, b_a16N)] -> Maybe b_a16N
          [LclId[JoinId(1)(Just [!])], Arity=1, Str=<1L>, Unf=OtherCon []]
        go1_a2fZ (ds_a2g0 :: [(a_a16M, b_a16N)])
          = case ds_a2g0 of {
              [] -> GHC.Maybe.Nothing @b_a16N;
              : y_a2g3 ys_a2g4 ->
                case y_a2g3 of { (x_aSO, y1_aSP) ->
                case == @a_a16M $dEq_a16O key_aHg x_aSO of {
                  False -> jump go1_a2fZ ys_a2g4;
                  True -> GHC.Maybe.Just @b_a16N y1_aSP
                }
                }
            }; } in
      jump go1_a2fZ eta_B0
```
The original `lookup_old` recurses directly (which makes no measurable operational difference):
```
lookup_old
  = \ (@a_a176)
      (@b_a177)
      ($dEq_a178 :: Eq a_a176)
      (_key_atq :: a_a176)
      (ds_d1m9 :: [(a_a176, b_a177)]) ->
      case ds_d1m9 of {
        [] -> GHC.Maybe.Nothing @b_a177;
        : ds1_d1mi xys_atu ->
          case ds1_d1mi of { (x_ats, y_att) ->
          case == @a_a176 $dEq_a178 _key_atq x_ats of {
            False -> lookup_old @a_a176 @b_a177 $dEq_a178 _key_atq xys_atu;
            True -> GHC.Maybe.Just @b_a177 y_att
          }
          }
      }
```
After discussion with @simonpj and @sgraf812, we decided that it's better to go with the original plan of a single rewrite rule:

```haskell
{-# RULES
"lookup/build" forall x (g :: forall b. ((k, a) -> b -> b) -> b -> b).
  lookup x (build g) = g (\(k, v) r -> if x == k then Just v else r) Nothing
 #-}
```
Dear CLC members, let's vote on the proposal to add a rewrite rule for `Data.List.lookup`, allowing it to fuse with good producers. The MR, tests, and technical discussion can be found at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10715.
@tomjaguarpaw @chshersh @mixphix @hasufell @angerman @parsonsmatt
+1 from me, this is a nice improvement, reducing allocations in certain scenarios.
+1
+1
I still don't understand why people who want performance are using association lists though ...
+1
+1 to the proposal and +1 to @tomjaguarpaw's comment above
Thanks all, 5 votes are enough to approve.
Currently, the `lookup` function from the Prelude does not have any list fusion facilities. I propose we add a RULE allowing it to fuse with `build`.

The obvious case where this is useful is when you have some function generating this list of tuples using `build`.

One less-obvious case where this would be useful (which happens to be the case I found this issue in) is as follows. Suppose you have a small sum type `Foo` and you want to write an invertible function `Foo -> Int`. Instead of implementing this by hand, you could write a list of input/output pairs and then write both functions as lookups on this list. Since GHC desugars lists using `build`, you can make the list disappear from the generated Core.

Generated Core (using GHC 9.6.2):

Of course, this nice Core output relies on the proposed fusion RULE.

Playground link
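A sketch of the pattern described above (`Foo`, `table`, `fooToInt`, and `intToFoo` are illustrative names, not taken from the issue): both directions are defined as lookups over a single table, and with the proposed fusion rule the desugared list can disappear from the generated Core.

```haskell
-- Hypothetical small sum type with lookup-based conversions.
data Foo = A | B | C deriving (Eq, Show)

-- Single source of truth for both directions.
table :: [(Foo, Int)]
table = [(A, 0), (B, 1), (C, 2)]

fooToInt :: Foo -> Maybe Int
fooToInt f = lookup f table

-- The inverse direction reuses the same table with pairs flipped.
intToFoo :: Int -> Maybe Foo
intToFoo n = lookup n [(i, f) | (f, i) <- table]

main :: IO ()
main = do
  print (fooToInt B)  -- Just 1
  print (intToFoo 2)  -- Just C
```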