tweag / ormolu

A formatter for Haskell source code
https://ormolu-live.tweag.io

Cannot parse module with only conditional CPP block #1040

wenkokke closed 1 year ago

wenkokke commented 1 year ago

Describe the bug

The GHC parser (in Haddock mode), and by extension Ormolu, cannot parse modules that contain only a conditional CPP block. This is possibly an upstream issue with ghc-lib-parser, but I'm filing it here because I'm not sure: GHC itself compiles the file correctly.

To Reproduce

Write the following to Bug.hs:

{-# LANGUAGE CPP #-}

module Bug where

#ifdef flag
constant :: Int
constant = 1312
#endif

Then run:

ormolu Bug.hs

Ormolu fails with:

Bug.hs:1:1
  The GHC parser (in Haddock mode) failed:
  {ErrorWithoutFlag
   lexical error in pragma at character 'i'}

Expected behavior

Ormolu outputs the file unchanged.

Environment

amesgen commented 1 year ago

Thanks for the report. This bug was introduced in #994, because we now parse the imports of a module before formatting: https://github.com/tweag/ormolu/blob/fe98934c0571ea1218524950feca8ad95352e345/src/Ormolu/Parser.hs#L96

The problem here is that rawInputStringBuffer is not pre-processed in any way, i.e. it still contains all CPP markers. Parsing the imports seems to cope with some CPP at least, e.g. this is formatted just fine, although only the first import (A) is taken into consideration:

{-# LANGUAGE CPP #-}

module Bug where

import A
#ifdef flag
import B
#endif
import C

One straightforward approach to fix this would be to remove all CPP lines before running parseImports; we could do this in a way that either removes or keeps import B in this example.
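
To make the two options concrete, here is a minimal standalone sketch of such a pre-processing step. It is not Ormolu's actual code: the function names (isCppDirective, blankDirectives, blankConditionalBlocks) are made up for illustration, and it works on a plain String rather than on rawInputStringBuffer. It also assumes that any line whose first non-space character is '#' is a CPP directive, which is only a heuristic (it ignores backslash continuations and would misfire on code lines that happen to start with '#'). Lines are blanked rather than deleted so that line numbers stay stable.

```haskell
module Main (main) where

import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | Heuristic (an assumption of this sketch): a line is a CPP directive
-- if its first non-space character is '#'.
isCppDirective :: String -> Bool
isCppDirective line = case dropWhile isSpace line of
  '#' : _ -> True
  _ -> False

-- | Option 1: blank out only the directive lines, so that code inside
-- conditional blocks (e.g. @import B@) is kept.
blankDirectives :: String -> String
blankDirectives = unlines . map blank . lines
  where
    blank l
      | isCppDirective l = ""
      | otherwise = l

-- | Option 2: blank out the directive lines and everything between an
-- @#if@/@#ifdef@/@#ifndef@ and its matching @#endif@, so that
-- @import B@ is removed.
blankConditionalBlocks :: String -> String
blankConditionalBlocks = unlines . go (0 :: Int) . lines
  where
    go _ [] = []
    go depth (l : ls)
      | opensBlock l = "" : go (depth + 1) ls
      | closesBlock l = "" : go (max 0 (depth - 1)) ls
      | isCppDirective l || depth > 0 = "" : go depth ls
      | otherwise = l : go depth ls
    -- Text of a directive line after the leading '#'.
    directive = dropWhile isSpace . drop 1 . dropWhile isSpace
    opensBlock l = isCppDirective l && "if" `isPrefixOf` directive l
    closesBlock l = isCppDirective l && "endif" `isPrefixOf` directive l

main :: IO ()
main = do
  let input =
        unlines
          ["import A", "#ifdef flag", "import B", "#endif", "import C"]
  putStr (blankDirectives input)
  putStrLn "---"
  putStr (blankConditionalBlocks input)
```

With the first variant, imports A, B and C all remain visible to the import parser; with the second, only A and C do. Which of the two is preferable depends on whether imports guarded by CPP should be taken into consideration.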