haskell / haddock

Haskell Documentation Tool
www.haskell.org/haddock/
BSD 2-Clause "Simplified" License

Avoid Code Generation by Disabling TemplateHaskell #1547

Closed parsonsmatt closed 1 year ago

parsonsmatt commented 1 year ago

I did a profiling run with the persistent-test and got this profiteur result

The time is completely dominated by code generation. 63% of time and 67% of allocations!

I'm passing --optghc=-fno-code and even trying to set the DynFlags to not build code, but it's ignoring it.

With the output from withTimings, here's an example module:

[41 of 41] Compiling UpsertTest       ( src/UpsertTest.hs, /tmp/ghc927461_0/ghc_86.o )
*** Parser [UpsertTest]:
!!! Parser [UpsertTest]: finished in 1.48 milliseconds, allocated 4.784 megabytes
*** Renamer/typechecker [UpsertTest]:
*** processModule:
*** createInterface:
!!! createInterface: finished in 0.40 milliseconds, allocated 2.046 megabytes
!!! processModule: finished in 0.66 milliseconds, allocated 2.371 megabytes
!!! Renamer/typechecker [UpsertTest]: finished in 17.32 milliseconds, allocated 32.451 megabytes
*** Desugar [UpsertTest]:
Result size of Desugar (before optimization)
  = {terms: 4,246, types: 4,980, coercions: 882, joins: 0/672}
Result size of Desugar (after optimization)
  = {terms: 2,890, types: 3,218, coercions: 701, joins: 0/160}
!!! Desugar [UpsertTest]: finished in 5.89 milliseconds, allocated 11.031 megabytes
*** Simplifier [UpsertTest]:
Result size of Simplifier iteration=1
  = {terms: 3,069, types: 3,432, coercions: 1,027, joins: 0/166}
Result size of Simplifier
  = {terms: 3,048, types: 3,412, coercions: 1,023, joins: 0/165}
!!! Simplifier [UpsertTest]: finished in 9.64 milliseconds, allocated 17.176 megabytes
*** CoreTidy [UpsertTest]:
!!! CoreTidy [UpsertTest]: finished in 0.75 milliseconds, allocated 2.167 megabytes
Result size of Tidy Core
  = {terms: 3,048, types: 3,412, coercions: 1,023, joins: 0/165}
*** CorePrep [UpsertTest]:
Result size of CorePrep
  = {terms: 5,098, types: 5,952, coercions: 1,023, joins: 0/1,144}
!!! CorePrep [UpsertTest]: finished in 4.04 milliseconds, allocated 6.778 megabytes
*** CoreToStg [UpsertTest]:
*** Stg2Stg:
!!! CoreToStg [UpsertTest]: finished in 2.15 milliseconds, allocated 5.676 megabytes
*** CodeGen [UpsertTest]:
!!! CodeGen [UpsertTest]: finished in 228.48 milliseconds, allocated 258.712 megabytes
*** WriteIface [/tmp/ghc927461_0/ghc_85.hi]:
!!! WriteIface [/tmp/ghc927461_0/ghc_85.hi]: finished in 0.22 milliseconds, allocated 1.194 megabytes

The module in question doesn't even use TemplateHaskell, so that can't be it.

processModule (the code that Haddock actually needs) only takes 0.66ms. Meanwhile, CodeGen takes 228ms. If we could eliminate the CodeGen step, we could save an enormous amount of time.
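
To put a number on that for this one module, here is a rough sketch that sums the top-level phase timings from the log above and reports the CodeGen share. The names `phaseTimings` and `phaseShare` are made up for illustration; processModule and createInterface are left out because they are nested inside the Renamer/typechecker timing, and Stg2Stg is omitted because the log has no "finished" line for it.

```haskell
module Main (main) where

import Text.Printf (printf)

-- Top-level phase timings in milliseconds, copied from the UpsertTest log.
-- (Hypothetical helper data; not part of Haddock or GHC.)
phaseTimings :: [(String, Double)]
phaseTimings =
  [ ("Parser", 1.48)
  , ("Renamer/typechecker", 17.32)
  , ("Desugar", 5.89)
  , ("Simplifier", 9.64)
  , ("CoreTidy", 0.75)
  , ("CorePrep", 4.04)
  , ("CoreToStg", 2.15)
  , ("CodeGen", 228.48)
  , ("WriteIface", 0.22)
  ]

-- Fraction of the summed phase time that is spent in the named phase.
phaseShare :: String -> [(String, Double)] -> Double
phaseShare name timings =
  sum [t | (n, t) <- timings, n == name] / sum (map snd timings)

main :: IO ()
main =
  printf "CodeGen share: %.1f%%\n" (100 * phaseShare "CodeGen" phaseTimings)
```

For this module the share comes out well over 80%, which is even higher than the whole-run profile figure, presumably because the profile also counts work outside these per-module phases.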

Saving this issue as a follow-up point for later...

mpickering commented 1 year ago

I'm pretty sure this is because `persistent-test` has `-XTemplateHaskell` in the default-extensions. You need to run TH splices in order to typecheck a module (and hence generate documentation), and hence you need code generation.

parsonsmatt commented 1 year ago

Huh, you're right. TemplateHaskell implies code generation, even if the file doesn't have any TemplateHaskell use. Setting `{-# LANGUAGE NoTemplateHaskell{,Quotes}, NoQuasiQuotes #-}` does disable the codegen phase and saves a bunch of time.
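
Spelled out without the brace shorthand, the per-module opt-out might look like the sketch below. The module contents (`describeRow`, `main`) are a hypothetical stand-in, not code from persistent-test; the only part taken from this thread is the set of extensions being switched off. It assumes the module really contains no splices, quotes, or quasiquotes, otherwise it will no longer compile.

```haskell
{-# LANGUAGE NoTemplateHaskell, NoTemplateHaskellQuotes, NoQuasiQuotes #-}
-- File-level pragmas are processed after the extensions passed on the
-- command line, so this turns TemplateHaskell back off for this module
-- even when the package enables it via default-extensions.
module Main (main) where

-- Hypothetical stand-in for ordinary, splice-free test code.
describeRow :: Int -> String -> String
describeRow key name = "row " ++ show key ++ ": " ++ name

main :: IO ()
main = putStrLn (describeRow 1 "example")
```

The trade-off is that this has to be repeated in every module that doesn't need TH; removing `TemplateHaskell` from default-extensions and enabling it only where splices are used would have the same effect from the other direction.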

I'll update the ticket title.

parsonsmatt commented 1 year ago

Oh, you know what, -fprefer-byte-code is probably exactly what we want here, isn't it? 😄