haskell / haddock


Haddock crash when table contains "unicode" symbols #1578

Open guibou opened 1 year ago

guibou commented 1 year ago

The following code crashes haddock:

{- |

+-----+
| ✅  |
+-----+

-}
module Toto where

$ haddock --version
Haddock version 2.27.0, (c) Simon Marlow 2006
Ported to use the GHC API by David Waern 2006-2008
$ which haddock
/nix/store/80k5c2yalbmmgny0np0y7ayd864xqpj3-ghc-9.4.4/bin/haddock
$ haddock Toto.hs  

<no location info>: error:
    Data.Text.Internal.Fusion.Common.index: Index too large
CallStack (from HasCallStack):
  error, called at libraries/text/src/Data/Text/Internal/Fusion/Common.hs:1180:24 in text-2.0.1:Data.Text.Internal.Fusion.Common
  streamError, called at libraries/text/src/Data/Text/Internal/Fusion/Common.hs:1080:33 in text-2.0.1:Data.Text.Internal.Fusion.Common
  indexI, called at libraries/text/src/Data/Text/Internal/Fusion.hs:249:9 in text-2.0.1:Data.Text.Internal.Fusion
  index, called at libraries/text/src/Data/Text.hs:1839:13 in text-2.0.1:Data.Text
  index, called at utils/haddock/haddock-library/src/Documentation/Haddock/Parser.hs:464:17 in main:Documentation.Haddock.Parser
haddock: Cannot typecheck modules

This is highly sensitive to whitespace. For example:

+-----+
| ✅   |
+-----+

works.

I suspect the problem is that line lengths are checked by byte count or by character count, and the two do not match because of the encoding.
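
To make the suspected mismatch concrete, here is a minimal sketch that counts characters and UTF-8 bytes for the rows above. It uses only base and plain String rather than the Text type that appears in the stack trace, and the names utf8Width, utf8Length, border, crashingRow and workingRow are illustrative, not taken from haddock.

```haskell
import Data.Char (ord)

-- Number of bytes one character occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- Total UTF-8 byte length of a string.
utf8Length :: String -> Int
utf8Length = sum . map utf8Width

-- The table border and the two rows from the report.
-- '\x2705' is the check mark character used in the table.
border, crashingRow, workingRow :: String
border      = "+-----+"
crashingRow = "| \x2705  |"   -- two spaces after the check mark
workingRow  = "| \x2705   |"  -- three spaces after the check mark
```

With these definitions, map length [border, crashingRow, workingRow] evaluates to [7,6,7] and map utf8Length [border, crashingRow, workingRow] evaluates to [7,8,9]. Only the working row has the same character count as the border. That would be consistent with a column index derived from the border running past the end of the shorter row, but this is a guess and has not been checked against Parser.hs.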

This is known, see https://github.com/haskell/haddock/pull/718#issuecomment-353806434, where @phadej says:

There /will/ be a problem with UTF-8 as for tables we need to count characters. I won't do anything for that at this point.

I'm mostly opening the ticket for reference.

That said, it may be possible to be more robust and either generate an invalid table or fail more gracefully.
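
As a rough sketch of what "more robust" could look like, assuming the crash comes from an unchecked index into a row that is shorter than the border: a bounds-checked lookup returns Nothing instead of throwing, and short rows can be padded before they are inspected. The helpers safeIndex and padTo are hypothetical and work on String; they are not part of haddock or of the text library.

```haskell
-- Bounds-checked character lookup: Nothing when the index is out of range.
safeIndex :: String -> Int -> Maybe Char
safeIndex s i
  | i < 0     = Nothing
  | otherwise =
      case drop i s of
        (c : _) -> Just c
        []      -> Nothing

-- Pad a row with trailing spaces up to the given character count.
-- Rows that are already long enough are returned unchanged.
padTo :: Int -> String -> String
padTo n s = s ++ replicate (n - length s) ' '

tableBorder, shortRow :: String
tableBorder = "+-----+"
shortRow    = "| \x2705  |"  -- the row from the crashing example
```

Here safeIndex shortRow 6 is Nothing, whereas safeIndex tableBorder 6 is Just '+'. After padTo (length tableBorder) shortRow the row has 7 characters, so the same lookup succeeds, although the resulting table may still be rendered incorrectly.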

Note: I'm using haddock with ghc 9.4 which uses text 2, but I've also observed the problem with ghc 9.2 and text 1.2.

Kleidukos commented 1 year ago

Thank you for reporting this! :heart: