haskell / haddock

Haskell Documentation Tool
www.haskell.org/haddock/
BSD 2-Clause "Simplified" License

hi-haddock for ghc 9.6 #1597

Closed: FinleyMcIlwaine closed this 1 year ago

FinleyMcIlwaine commented 1 year ago

I'm working on getting hi-haddock over the finish line. The steps I intend to execute are:

This MR is complete and ready to merge. I have verified that the tests pass (with a patched version of GHC that includes fixes for associated data family documentation). I have observed no performance regressions, only a slight improvement in memory usage for baseline Haddock generation (a default cabal haddock run). Testing on the Agda codebase, the -s statistics are:

  38,109,588,176 bytes allocated in the heap
  11,122,078,776 bytes copied during GC
     548,141,736 bytes maximum residency (26 sample(s))
       8,119,640 bytes maximum slop
            1552 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      9091 colls,     0 par    5.779s   5.876s     0.0006s    0.0191s
  Gen  1        26 colls,     0 par    3.893s   4.089s     0.1573s    0.5272s

  TASKS: 5 (1 bound, 4 peak workers (4 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.004s  (  0.004s elapsed)
  MUT     time    9.167s  ( 10.767s elapsed)
  GC      time    9.672s  (  9.965s elapsed)
  EXIT    time    0.018s  (  0.003s elapsed)
  Total   time   18.862s  ( 20.740s elapsed)

  Alloc rate    4,157,131,469 bytes per MUT second

  Productivity  48.6% of total user, 51.9% of total elapsed

For comparison, the baseline measured with Haddock on the current ghc-9.6 branch:

  37,907,128,536 bytes allocated in the heap
  11,181,486,680 bytes copied during GC
     556,370,840 bytes maximum residency (24 sample(s))
       7,191,536 bytes maximum slop
            1581 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      9048 colls,     0 par    5.948s   6.002s     0.0007s    0.0049s
  Gen  1        24 colls,     0 par    4.005s   4.191s     0.1746s    0.5561s

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.004s  (  0.004s elapsed)
  MUT     time    9.778s  ( 11.120s elapsed)
  GC      time    9.952s  ( 10.193s elapsed)
  EXIT    time    0.018s  (  0.011s elapsed)
  Total   time   19.752s  ( 21.327s elapsed)

  Alloc rate    3,876,731,318 bytes per MUT second

  Productivity  49.5% of total user, 52.1% of total elapsed

~Just a small improvement (\~8MB) in maximum residency~. Happy to make changes as requested and provide more metrics if necessary.
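
The differences between the two -s reports above can be checked with a few lines of arithmetic. This is a minimal sketch that only restates the quoted figures; the dictionary keys and the `delta` helper are illustrative names, not part of Haddock or GHC.

```python
BYTES_PER_MIB = 1024 * 1024

# Figures copied from the two -s reports above (hi-haddock branch vs. ghc-9.6 baseline).
hi_haddock = {"allocated": 38_109_588_176, "max_residency": 548_141_736, "total_mib": 1552}
baseline = {"allocated": 37_907_128_536, "max_residency": 556_370_840, "total_mib": 1581}


def delta(metric: str) -> int:
    """Baseline minus hi-haddock; a positive result means hi-haddock used less."""
    return baseline[metric] - hi_haddock[metric]


residency_saved_mib = delta("max_residency") / BYTES_PER_MIB

print(f"max residency saved: {residency_saved_mib:.2f} MiB")
print(f"total memory saved:  {delta('total_mib')} MiB")
print(f"extra allocation:    {-delta('allocated'):,} bytes")
```

On these numbers the residency difference is roughly 8 MB, in line with the figure quoted above, while total heap allocation is slightly higher on the hi-haddock branch.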

EDIT: The metrics noted above were affected by some environment and tooling configurations that were not accounted for at the time that this PR was initially submitted. Fortunately, Hi Haddock has a much larger (positive) impact on Haddock's memory usage. See my blog post here.

FinleyMcIlwaine commented 1 year ago

@Kleidukos let me know if there is anything I can do to make this easier to review. I do not think we need to be concerned about the perf regressions mentioned in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10249#note_491390. I have not been able to reproduce them. These are the perf results I am seeing:

                                          Baseline                        
                    Test    Metric           value       New value Change 
--------------------------------------------------------------------------
   haddock.Cabal(normal) run/alloc  23,567,939,952  24,685,266,616  +4.7% 
    haddock.base(normal) run/alloc  44,852,672,216  46,898,560,216  +4.6% 
haddock.compiler(normal) run/alloc 191,419,016,712 192,057,742,680  +0.3%

Where the baseline is the ghc-9.6 branch (in the GHC repo, not Haddock), and the tested branch includes this work in the haddock submodule.
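
The Change column can be reproduced from the two value columns as a plain relative difference. A minimal sketch, assuming the change is computed as (new - baseline) / baseline; the `percent_change` helper is a hypothetical name and not part of the GHC perf test tooling.

```python
# Figures copied from the run/alloc table above: (baseline, new), in bytes.
results = {
    "haddock.Cabal(normal)": (23_567_939_952, 24_685_266_616),
    "haddock.base(normal)": (44_852_672_216, 46_898_560_216),
    "haddock.compiler(normal)": (191_419_016_712, 192_057_742_680),
}


def percent_change(baseline: int, new: int) -> float:
    """Relative change of `new` against `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


for test, (baseline, new) in results.items():
    print(f"{test:<26} {percent_change(baseline, new):+.1f}%")
```

Rounded to one decimal place, this gives the same +4.7%, +4.6% and +0.3% shown in the table.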

The recompilation avoidance discussed in the docs is dependent on this GHC MR, so I will mark this as a draft until those changes are merged and backported. I will ping you when/if that occurs 🙂

Kleidukos commented 1 year ago

@FinleyMcIlwaine Thanks for doing this work. :)

FinleyMcIlwaine commented 1 year ago

@Kleidukos The recompilation avoidance section in invoking.rst is now accurate, and this PR is ready for merge. Thanks!