haskell / haddock

Haskell Documentation Tool
www.haskell.org/haddock/
BSD 2-Clause "Simplified" License

Using Lazy Text #1554

Closed parsonsmatt closed 2 months ago

parsonsmatt commented 1 year ago

I was investigating the weird perf issues with the Builder stuff, and noticed that GHC 9.4.2 and GHC 9.4.3 have very different performance characteristics. With GHC 9.4.2, Builder performs worse. But with GHC 9.4.3, it's significantly better.

After thorough investigation, I figured out that part of the issue was materializing all these [Char]. Particularly, URLs seemed to be trouble - the xhtml interface only allowed you to use String for them, which isn't good. I patched xhtml to use LText for the Attr, which has a nice balance between performance for concatenation and inspection. This rippled through the code, pushing a ton of String into Text.
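To make the String-versus-lazy-Text point concrete, here is a minimal sketch. The `Attr` type and the `hrefAttr` and `renderAttr` helpers are hypothetical names made up for illustration; they are not the actual xhtml API or the patch itself.

```haskell
import qualified Data.Text.Lazy as LText
import Data.Text.Lazy (Text)

-- Hypothetical attribute type for illustration only; the real xhtml
-- definition may look different.
data Attr = Attr
  { attrName  :: Text
  , attrValue :: Text
  }

-- Build an href value from URL pieces. Appending lazy Text links
-- chunks together instead of building a [Char] character by character.
hrefAttr :: Text -> Text -> Attr
hrefAttr base path = Attr (LText.pack "href") (base <> LText.pack "/" <> path)

-- Render as name="value" (escaping is omitted to keep the sketch short).
renderAttr :: Attr -> Text
renderAttr (Attr n v) = n <> LText.pack "=\"" <> v <> LText.pack "\""
```

For example, `renderAttr (hrefAttr (LText.pack "https://example.org") (LText.pack "doc.html"))` yields `href="https://example.org/doc.html"` without any intermediate `String`.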

Additionally, I noticed that we were parsing things using Parsec as Text, but then unpacking them into [Char]. So I pushed the Text through the codebase.
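As a rough illustration of the unpacking problem, the sketch below contrasts a round trip through `String` with staying in `Text`. The function names are hypothetical, and strict `Data.Text` is an assumption here; this is not code from Haddock.

```haskell
import qualified Data.Text as T
import Data.Char (isSpace)

-- Round trip through String: the parsed Text is unpacked into a
-- [Char], processed, and packed again.
trimViaString :: T.Text -> T.Text
trimViaString = T.pack . dropWhile isSpace . T.unpack

-- Same result while staying in Text, with no intermediate [Char].
trimDirect :: T.Text -> T.Text
trimDirect = T.dropWhile isSpace
```

Both functions return the same value; the second simply avoids the intermediate list that the first may materialize.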

This resulted in a pretty nice improvement in performance. My baseline, using the ghc-9.4 branch:

!!! ppHtml: finished in 225.55 milliseconds, allocated 1087.121 megabytes
haddock ghc-9.4 branch
  16,056,038,816 bytes allocated in the heap
   2,241,066,456 bytes copied during GC
     193,369,736 bytes maximum residency (14 sample(s))
       5,134,712 bytes maximum slop
             507 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3818 colls,     0 par    1.234s   1.237s     0.0003s    0.0032s
  Gen  1        14 colls,     0 par    0.658s   0.658s     0.0470s    0.1228s

  TASKS: 5 (1 bound, 4 peak workers (4 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.000s elapsed)
  MUT     time    4.891s  (  5.423s elapsed)
  GC      time    1.892s  (  1.896s elapsed)
  EXIT    time    0.001s  (  0.002s elapsed)
  Total   time    6.785s  (  7.320s elapsed)

  Alloc rate    3,282,509,711 bytes per MUT second

  Productivity  72.1% of total user, 74.1% of total elapsed

Documentation created:
/home/matt/Projects/persistent/dist-newstyle/build/x86_64-linux/ghc-9.4.3/persistent-2.14.4.3/doc/html/persistent/index.html

This branch (which does include the #1552 code, too) has these numbers:

!!! ppHtml: finished in 172.81 milliseconds, allocated 888.132 megabytes

  16,042,480,936 bytes allocated in the heap
   2,227,119,928 bytes copied during GC
     190,551,728 bytes maximum residency (14 sample(s))
       5,122,384 bytes maximum slop
             482 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3745 colls,     0 par    1.222s   1.224s     0.0003s    0.0031s
  Gen  1        14 colls,     0 par    0.703s   0.703s     0.0502s    0.1366s

  TASKS: 5 (1 bound, 4 peak workers (4 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time    4.974s  (  5.507s elapsed)
  GC      time    1.925s  (  1.927s elapsed)
  EXIT    time    0.001s  (  0.006s elapsed)
  Total   time    6.901s  (  7.440s elapsed)

  Alloc rate    3,225,022,895 bytes per MUT second

  Productivity  72.1% of total user, 74.0% of total elapsed

| | Baseline | Lazy Text | Difference | Improvement |
|---|---|---|---|---|
| HTML time | 225.55 ms | 172.81 ms | 52 ms | 23% |
| HTML allocation | 1087.121 MB | 888.132 MB | 198.99 MB | 18.3% |
| Max residency | 193,369,736 B | 190,551,728 B | 2.69 MB | 1.5% |
| Total Memory | 507 MB | 482 MB | 25 MB | 4.93% |
| Allocations | 16,056,038,816 B | 16,042,480,936 B | ~12 MB | 0.08% |
| Time | 6.785s | 6.901s | -0.116s | -1.71% |

23% faster and 18.3% less memory while generating HTML, though the rest of the code is a tiny bit slower.
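The percentages can be reproduced from the raw measurements. This is a small sketch of the arithmetic; `improvementPct` is a helper name introduced here for illustration, and the inputs are the figures reported in the two runs.

```haskell
-- Relative improvement of a new measurement over a baseline, in percent.
-- Positive means the new value is smaller (better); negative means worse.
improvementPct :: Double -> Double -> Double
improvementPct baseline new = (baseline - new) / baseline * 100

-- HTML generation time in milliseconds: roughly 23.4
htmlTimePct :: Double
htmlTimePct = improvementPct 225.55 172.81

-- HTML generation allocation in megabytes: roughly 18.3
htmlAllocPct :: Double
htmlAllocPct = improvementPct 1087.121 888.132

-- Total run time in seconds: roughly -1.71 (slightly slower overall)
totalTimePct :: Double
totalTimePct = improvementPct 6.785 6.901
```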

There's a long way to go towards making HTML generation more efficient. Timing data shows that we allocate roughly 200 times as much memory as the final HTML file weighs - so a ~1.1MB file on disk allocates ~222MB.

Kleidukos commented 1 year ago

@parsonsmatt Great PR, thanks a lot! I have an adjacent question: Have you considered using OsPath instead of FilePath?

mpickering commented 1 year ago

Am I misunderstanding the numbers here? It seems like the patch makes things slower overall.

9.4.2 had a bug in eta-expansion which was fixed in 9.4.3 (which affected the Builder performance).

It is also impossible to review this patch with the current commit history. Can you please clean that up if you're going to pursue this?

parsonsmatt commented 1 year ago

@Kleidukos I'm not aware of OsPath and a Stackage search doesn't bring anything up.

@mpickering This PR is based on two other PRs, and when those are merged, this will be much easier to review. While the overall runtime is slightly slower in this case, the benefit is huge for packages that generate large documentation pages - project-specific Prelude-like modules in particular. Once I get the work app using 9.4.3 or 9.4.4, I can get more up-to-date perf numbers.

Kleidukos commented 1 year ago

> @Kleidukos I'm not aware of OsPath and a Stackage search doesn't bring anything up.

Hoogle is more reliable, then. It's a type from https://flora.pm/packages/@haskell/filepath.

Kleidukos commented 2 months ago

Hi, thank you for this PR, but Haddock now lives full-time in the GHC repository! Read more at https://discourse.haskell.org/t/haddock-now-lives-in-the-ghc-repository/9576.

Let me know if you feel it is still needed, and I'll migrate it. :)