haskell-perf / sequences

Benchmarks for sequence data structures: lists, vectors, etc.
88 stars 11 forks source link

Maybe make a traverse test #24

Open Anton-Latukha opened 2 years ago

Anton-Latukha commented 2 years ago

traverse pure probably suitable to be used for data structure spine traversal "in the data-type-recommended way".

Maybe there even more idiomatic way to do it.

Maybe traverse (const $ Identity ()) would ensure nudging Haskell to not manipulating, maybe even not looking into values at all.

As I am interested to see how data structure spine traversing goes in which data types to understand the impact of the design of data structures on spine traversal & so fold.

Anton-Latukha commented 2 years ago

Length is kinda a spine traversal test, but length :: Foldable t => t a -> Int does not guarantee the real traversal of datatype happening, does traversal happens depends on datatype design & how instance Foldable Type length is optimally retrieved (for example type can store its size or even num of elems literally), length does not guarantee traversal.

Anton-Latukha commented 2 years ago

Also, afaiu Min & Max benchmarks are mostly also mirroring a traverse benchmark. So having a clean representation of traverse would be informative & {Length,Max,Min} are additional benchmarks of particular information getters.

chrisdone commented 2 years ago

I think length is a good benchmark by itself, because Seq etc do indeed track length, but DList/Acc don't, which costs Seq extra constant factors, but wins when calling length. Asking for the size of something is definitely a good test in its own right.

For traversals... I'm not sure how worth it that is. Often times the function being ran will have more overhead than the traversal itself. But feel free to come up with an interesting test. :ok_hand:

Anton-Latukha commented 2 years ago

Yes, since I like folds & traversals & between the two - traversal would be the most direct test of data type spine traverse. With laziness, const should ensure Haskell that the value in the cell should not be looked into. I believe something like void or Identity should be cheap.

If only https://github.com/nomeata/ghc-heap-view/pull/36 merged so ghc-vis was able to update to support new compilers - then it would've been easy to just observe traversal. If I could literally look it in ghc-vis - I'd submitted it right now, but otherwise, I need to appeal to the word of an authority & believe that something traverses only the data type spine.

BTW this may be useful in checking the benchmarks also: https://www.youtube.com/watch?v=I4lnCG18TaY

chrisdone commented 2 years ago

You should use IO to do the traversal if anything. I think Identity would be equivalent to calling map, because GHC sees that it's just a newtype wrapper.

Anton-Latukha commented 2 years ago

Ok. I would look into traverse somewhere along the line :rofl:.