tweag / ormolu

A formatter for Haskell source code
https://ormolu-live.tweag.io
Other
958 stars 83 forks source link

Weird Haddock comments can't be formatted without `--unsafe` #726

Closed amesgen closed 2 years ago

amesgen commented 3 years ago

Describe the bug Since #722, certain weird Haddock comments can not be formatted without --unsafe.

To Reproduce Try to format BufferMode.hs from leksah, or this minimized snippet:

-- |
--
---
module Test where

Expected behavior Not yet decided, but it should not error.

amesgen commented 3 years ago

EDIT This analysis is incorrect, see comment below.


This appears to be a regression in ghc-lib-parser-9.0 and should be fixed in the upgrade to ghc-lib-parser-9.2 (which should be released in August according to the GHC status update :tada:).


The third line (the --- ) is actually not necessary for this issue. Right now, with --unsafe, the snippet

-- |
--
module Test where

is formatted into

-- |
module Test where

and again into

module Test where

(this is similiar to empty Haddock comments on function arguments not appearing in the AST since ghc-lib-parser-9.0).

Previously, in 0.1.4.1 with ghc-lib-parser-8.10, the first snippet would be formatted into the second one idempotently, and the AST did not differ as we drop trailing blank lines when printing/comparing Haddock comments. But now, with ghc-lib-parser-9.0, the parsed AST of the second snippet does no longer contain a Haddock comment, in contrast to the first snippet, so the AST differs.

amesgen commented 3 years ago

This issue is actually caused by invalid Haddock comments (cf. #803). Namely, this is a valid Haddock comment:

-- |
--
module Test where

which is formatted to

-- |
module Test where

as blank lines are collapsed. But in the second snippet, the Haddock comment is now invalid for some reason.

I think the best approach to fix this is to just drop (valid and invalid) Haddock comments if they only contain spaces.


For complete madness, consider how the GHC parser handles empty Haddock comments at different places:

-- |
module Test
  ( -- |
    test,
  )
where

-- |
test ::
  -- |
  test

Here, confusingly, the second and third Haddock comments are valid, but the first and second one are invalid. Hence, Ormolu formats this as

--
module Test
  ( -- |
    test,
  )
where

-- |
test ::
  --
  test

which seems very random. With the proposed change, it would be formatted as

module Test
  ( test,
  )
where

test ::
  test

which is a clear improvement IMO.