Open meooow25 opened 1 year ago
IF we change the implementation, I'd argue that primitive
is even better suited for arrays than array
, as it doesn't store boxed indices. One possible problem is if Data.Graph
uses slicing (by adjusting the start/stop indices), but it looks like array
doesn't even provide a function for that.
It might be possible to optimize the layout even more by using the unboxed array directly instead of a nested array, but that would require a second array to store the indices where the vertex lists start.
It seems like a massive oversight to make Graph
a type alias instead of a newtype, but I'm afraid it might be too late to change that now, at least not without a big announcement and a migration strategy (even though the amount of breakage is probably small).
@konsumlamm, I fear the amount of breakage is likely not small, though the number of programs affected likely is. I don't think the (few) users of the module have ever shied away from using what they know about the representation.
I doubt we can use primitive
, since it's not a GHC boot library. We can make our own type for an Int
array with limited operations if we really want.
@treeowl so do you think this is not practical to do, given the amount of breakage?
@meooow25 You could attempt an impact assessment, but I'm not terribly optimistic. You might be better off exploring those ideas in another package, and try to see if you get good performance using that for, e.g., GHC.
Bodigrim's comment about improved allocations https://github.com/haskell/containers/pull/909#issuecomment-1414503913 and #928 makes me wonder if it would be good to represent graphs as
Array Int (UArray Int Int)
or something similar, instead ofArray Int [Int]
. We all know that[Int]
is not a great way to storeInt
s.Downsides: Breaking change, and construction would be very complicated. For example we can't just use
accumArray
to construct from edges inbuildG
. But memory used should be a lot better, 3 words out of 4 saved per edge if I understand correctly. And dfs algorithms should not suffer in any way.