travitch / persistent-vector

Persistent vectors for Haskell based on array mapped tries
BSD 3-Clause "New" or "Revised" License
27 stars 4 forks source link

Use SmallArray# instead of Array# #15

Open treeowl opened 4 years ago

treeowl commented 4 years ago

The arrays are fairly small, and, just as importantly, they're used in a "mostly-pure" fashion. So performance will almost certainly be better with SmallArray# than Array#.

travitch commented 4 years ago

Good idea - I think SmallArray# was introduced after this code was started, but I never re-evaluated. I suspect this would help some of the unfortunate GC behavior I've observed.

treeowl commented 4 years ago

What unfortunate behavior have you observed? SmallArray is much more about trying to make the mutator fast; Array tries to avoid certain pathological GC perf problems that just probably don't show up here.

travitch commented 4 years ago

It has been difficult to get a detailed breakdown of where the GC spends its time, but my benchmarks have just spent way more time than I expected collecting garbage. I think it was a combination of small nursery sizes (in older GHC versions) and each modification re-allocating multiple array chunks (to rebuild the spine of the tree).

The documentation for SmallArray indicates that it is more efficient during GC as long as your card table would have only had a single entry (i.e., < 128), which should apply in this case. I'm not sure how much that will help, of course.

treeowl commented 4 years ago

Oh, interesting. I'd never read that. But I can't imagine an unnecessary card table check would explain particularly bad GC performance. Probably something else going on. Of course, performance building really big vectors is basically guaranteed to be shit regardless, just because it violates the generational hypothesis. We can only do what we can do.