haskell / haddock

Haskell Documentation Tool
www.haskell.org/haddock/
BSD 2-Clause "Simplified" License

Caching Generated Markup #1548

Open parsonsmatt opened 1 year ago

parsonsmatt commented 1 year ago

Haddock allocates 800 GB when rendering HTML for our codebase. The big culprits appear to be some of our "prelude" modules, which re-export a huge number of items. The biggest offender produces 249 MB of HTML on disk.

What's worse is that we spend a huge amount of that time just escaping HTML strings in the xhtml library.

One potential solution is to store the HTML fragment for each documented item on disk. Then, instead of spending a massive amount of time rebuilding the same docs over and over again, we'd be able to reuse the HTML that's already been generated and just splice it in. This could work by hashing the ExportItem DocNameI that's provided to processExport and using that hash as a lookup key.
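To make the idea concrete, here is a minimal sketch of a hash-keyed on-disk fragment cache. It is not Haddock code: `Item` is a hypothetical stand-in for ExportItem DocNameI, and `cacheKey`, `cachedRender`, and the FNV-1a style hash are illustrative assumptions rather than existing Haddock or xhtml APIs. It only uses base plus the directory and filepath packages that ship with GHC.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)
import Numeric (showHex)
import System.Directory (createDirectoryIfMissing, doesFileExist)
import System.FilePath ((</>), (<.>))

-- Hypothetical stand-in for Haddock's 'ExportItem DocNameI'. The real type
-- is much richer and would need a stable serialisation before hashing.
data Item = Item
  { itemName :: String
  , itemDoc  :: String
  } deriving (Show)

-- FNV-1a style hash over the characters of a string (assumption: good
-- enough for a sketch; a real cache would want a collision-resistant hash).
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Lookup key derived from the item's 'Show' output, rendered as hex.
cacheKey :: Item -> String
cacheKey item = showHex (fnv1a (show item)) ""

-- Return the cached fragment if one exists on disk; otherwise run the
-- (expensive) renderer once and store its result under the item's key.
cachedRender :: FilePath -> (Item -> IO String) -> Item -> IO String
cachedRender dir render item = do
  createDirectoryIfMissing True dir
  let path = dir </> cacheKey item <.> "html"
  hit <- doesFileExist path
  if hit
    then readFile path
    else do
      html <- render item
      writeFile path html
      pure html
```

With this shape, a module that re-exports an item already rendered elsewhere would call `cachedRender` with the same item and get the stored fragment back without re-running the renderer or the HTML escaping. The key here covers only the item itself, so anything that varies by export site would need to be part of the key or rendered separately.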

One downside of this approach is that the instances table currently shows all the instances in scope for a type, which can potentially change at each "re-export" site, so a cached fragment may not be valid everywhere the item appears. Generating these tables is also a huge cost, so saving that work would be nice, particularly if we can do so incrementally.