yav / pretty-show

Tools for working with derived Show instances in Haskell.
MIT License

Add a convenience function for pretty-printing list-like objects efficiently #23

Closed: BlakeHepner closed this issue 6 years ago

BlakeHepner commented 6 years ago

I don't know how to open a pull request on GitHub, but this function is probably easier to add from an issue anyway:

Currently, if you pPrint a list-like object, the whole object has to be loaded into memory before any of it is printed, even if the underlying object could be streamed. After a bit of trial and error, I wrote the following function, which pretty-prints any Foldable container one element at a time. Could you add it to the library?

import Data.Foldable (toList)
import Text.Show.Pretty (ppShow)
pPrintList :: (Foldable t, Show a) => t a -> IO ()
pPrintList x = putStrLn $ unlines $ ppShow <$> toList x

toList is from Data.Foldable, unlines is from the Prelude.
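
Why this streams can be sketched with base alone. The version below is a minimal illustration, not the library's code: `renderLines` and `printEachWith` are hypothetical names, and the renderer is passed in as a parameter (here `show`, standing in for `ppShow`) so the example does not depend on pretty-show. Because each element is rendered and written on its own, output can begin before the rest of the structure has been forced.

```haskell
import Data.Foldable (toList, traverse_)

-- Hypothetical helper: render every element separately, producing the
-- result list lazily.
renderLines :: Foldable t => (a -> String) -> t a -> [String]
renderLines render = map render . toList

-- Hypothetical helper: write one rendered element per line.
-- traverse_ runs the IO action on each element in order, so a lazy
-- list can be consumed incrementally instead of being held in full.
printEachWith :: Foldable t => (a -> String) -> t a -> IO ()
printEachWith render = traverse_ (putStrLn . render)

main :: IO ()
main = do
  -- 'show' is a stand-in; with pretty-show installed one would pass ppShow.
  printEachWith show [1 .. 3 :: Int]
  -- Laziness check: only the first two elements of an infinite list
  -- are ever rendered.
  print (take 2 (renderLines show ([1 ..] :: [Integer])))
```

One small difference from the `putStrLn $ unlines ...` form: writing each line with `putStrLn` does not emit an extra blank line at the end of the output.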

To give an idea of the difference, take the list [1..1000000]. Calling pPrint on it directly uses about 1.4 GB of memory at peak, while pPrintList uses only about 1 MB. The GHC runtime statistics (+RTS -s) for both runs follow.

With pPrintList:

$ stack exec pPrint -- +RTS -s > /dev/null
   3,655,126,344 bytes allocated in the heap
       8,995,832 bytes copied during GC
          56,816 bytes maximum residency (2 sample(s))
          21,008 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      7051 colls,     0 par    0.024s   0.023s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.649s  (  0.663s elapsed)
  GC      time    0.024s  (  0.023s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.724s  (  0.686s elapsed)

  Alloc rate    5,628,536,144 bytes per MUT second

  Productivity  96.6% of total user, 96.6% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

With pPrint applied directly:

$ stack exec pPrint -- +RTS -s > /dev/null
   5,578,543,472 bytes allocated in the heap
   3,707,552,872 bytes copied during GC
     559,616,216 bytes maximum residency (14 sample(s))
      13,602,856 bytes maximum slop
            1402 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     10471 colls,     0 par    1.825s   1.842s     0.0002s    0.0018s
  Gen  1        14 colls,     0 par    1.444s   1.839s     0.1314s    0.6499s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    1.486s  (  1.457s elapsed)
  GC      time    3.269s  (  3.681s elapsed)
  EXIT    time    0.002s  (  0.048s elapsed)
  Total   time    4.875s  (  5.187s elapsed)

  Alloc rate    3,753,862,321 bytes per MUT second

  Productivity  32.9% of total user, 29.0% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

yav commented 6 years ago

Thanks, good idea! I've added the function you suggested, along with some related functions (ppShowList and ppDocList). The only difference is that the result is printed as a list (i.e., with square brackets and commas between the elements). I chose to do it that way because one of the goals of pretty-show is for the output to stay as compatible with standard Show as possible while also being human-readable.

The new version, 1.6.14, should be on Hackage. Let me know if you find any issues.
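
For readers curious how a bracketed, comma-separated layout can still be produced incrementally, here is a rough sketch. It is not the implementation that shipped in pretty-show: `bracketedLines` is a hypothetical name, the renderer is a parameter (`show` stands in for `ppShow`), and the layout assumes each element renders to a single line.

```haskell
import Data.Foldable (toList)

-- Hypothetical helper: lay out the elements as a Show-style list, one
-- element per line, with the opening bracket on the first element and
-- a leading comma on every later one. The result is built lazily, so
-- the first lines are available before the input is fully traversed.
bracketedLines :: Foldable t => (a -> String) -> t a -> [String]
bracketedLines render xs =
  case toList xs of
    []       -> ["[]"]
    (y : ys) -> ("[ " ++ render y) : map ((", " ++) . render) ys ++ ["]"]

main :: IO ()
main = mapM_ putStrLn (bracketedLines show [1 .. 3 :: Int])
```

Putting the comma at the start of each line means no element needs to know whether another one follows it, which is what lets the lines be emitted one at a time.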