tweag / ormolu

A formatter for Haskell source code
https://ormolu-live.tweag.io
Other
944 stars 83 forks source link

Fix AST diffing for empty Haddock comments in data decls #1068

Closed amesgen closed 10 months ago

amesgen commented 10 months ago

Closes #1065, by basically amending #818 with another case where Haddock comments can occur

github-actions[bot] commented 10 months ago

🚀 Deployed on https://d5a7bbb4112d7792f021ae4fbfe5853cbd54a445--ormolu-live.netlify.app

tomjaguarpaw commented 10 months ago

Oh no! This is very unfortunate, and so is https://github.com/tweag/ormolu/pull/818. Empty Haddock comments are semantically meaningful. They cause the Haddock output to wrap.

For further information see

Ormolu obliterates this nice formatting :( Granted, using -- ^/-- | to force Haddock to wrap signatures is a hack, but there is no other way (I believe).

amesgen commented 10 months ago

Interesting, that is a very nice trick; but it doesn't seem to work anymore :sob:: You link to setStdin from typed-protocols 0.2.10.1, but in 0.2.11.0 (the current version), this trick doesn't seem to have any effect anymore (which is consistent with the fact that it does not show up as a Haddock comment in the AST anymore; how GHC associates Haddock comments to the AST was changed in GHC 9.0):

Maybe this should be considered to be an upstream GHC issue? If empty Haddock comments appear in the AST again, it will also be easier for Ormolu to preserve them. WDYT?

amesgen commented 10 months ago

I can also confirm locally that your trick works on GHC 8.10, but does no longer work since 9.0.

tomjaguarpaw commented 10 months ago

Thanks for a very speedy reply! Oh, yes, I see. I didn't realise that this had changed, and I didn't realise I hadn't checked the latest documentation. I guess the two versions were generated by the builder at times straddling the change to GHC 9.x for Haddock generation.

Maybe this should be considered to be an upstream GHC issue? If empty Haddock comments appear in the AST again, it will also be easier for Ormolu to preserve them. WDYT?

I suppose that seems reasonable but I'd still prefer to have

even though that won't get me any benefit under GHC 9.x (yet ...). Anyway, it's your application (which I enjoy using a lot) and your effort, so please do what you think is most appropriate.

In fact, now that I come to think of it, I don't understand why formatting data T = T {- ^ -} to data T = T passes the AST equivalency check anyway ...

amesgen commented 10 months ago

Maybe this should be considered to be an upstream GHC issue? If empty Haddock comments appear in the AST again, it will also be easier for Ormolu to preserve them. WDYT?

I suppose that seems reasonable but I'd still prefer to have

* `-- |` preserved

* `-- ^` changed to `-- |`

* only `{- ^ -}` deleted

even though that won't get me any benefit under GHC 9.x (yet ...).

I actually like your suggested behavior (though I would probably not treat {- ^ -} specially and still turn it into -- |); it is just annoying to implement as -- | does not appear in the AST as an Haddock comment, only as a "regular" comment in an EPA extension point, so we would have to manually look in those to find them and treat them like a proper Haddock comment. I have not investigated this in detail, but a priori, the implementation seems too fiddly in relation to the relatively low severity of this stylistic concern.

I think the proper fix is to actually change the GHC parser behavior as mentioned above, I opened https://gitlab.haskell.org/ghc/ghc/-/issues/23935 for that. I think that would also re-enable your Haddock trick, I tagged you there (hope that is fine).

In fact, now that I come to think of it, I don't understand why formatting data T = T {- ^ -} to data T = T passes the AST equivalency check anyway ...

We preprocess the AST to remove Haddock comments that are all blank (this PR is extending this preprocessing step).


I will open a new issue here about preserving blank Haddock comments (as suggested by you) if the reaction in the GHC ticket is positive; once we upgrade to a GHC that has the fix, we can remove the code that eg this PR is touching. In the meantime, I think that this PR is still an incremental improvement (it certainly doesn't make anything worse).

tomjaguarpaw commented 10 months ago

I would probably not treat {- ^ -} specially and still turn it into -- |

I think that would be even better if it fits with the constraints. I still don't fully understand the behaviour of the GHC parser and how that impinges.

I tagged you there (hope that is fine).

Yes, absolutely, thanks for doing that.