tweag / ormolu

A formatter for Haskell source code
https://ormolu-live.tweag.io
Other
950 stars 83 forks source link

Format files in parallel #1129

Closed michaelpj closed 1 month ago

michaelpj commented 2 months ago

Closes #1128

There is no explicit configuration for this, but it is possible for users to override by changing the RTS options to the ormolu executable.

mrkkrp commented 2 months ago

Thanks! I imagine you have verified that this indeed works and results in performance benefits? We have some checks on the format of the .cabal file, could you fix the lint issue by running cabal format?

amesgen commented 2 months ago

We have some checks on the format of the .cabal file, could you fix the lint issue by running cabal format?

Not directly related to this PR, but I just got reminded that we use cabal format which is removing the comments; we could switch to eg cabal-gild which preserves this (and is in general much better than cabal format). Any objections/shall I create a PR switching to that?

michaelpj commented 2 months ago

I imagine you have verified that this indeed works and results in performance benefits?

Yes, as I mentioned here I got a pretty good speedup (3x) on my usecase (formatting about 2.5k files).

We should check that, and either add some synchronization or check that -N1 produces ungarbled output and tell users to use that if they want ungarbled output.

Ah right, I have only been checking with --mode inplace. This is a bit awkward, since in principle we need to lock around all terminal writes. I think all the writes happen in app/Main.hs, so perhaps we can still do this reasonably cheaply.

I think we don't need to change sth here for correctness, but there actually might be some redundant work when multiple threads try to compute the value for the same cache key?

We could also just make this thread-safe by using an MVar instead of an IORef, right?

amesgen commented 2 months ago

We should check that, and either add some synchronization or check that -N1 produces ungarbled output and tell users to use that if they want ungarbled output.

Ah right, I have only been checking with --mode inplace.

Maybe we could only do parallel processing for --mode inplace? :thinking:

This is a bit awkward, since in principle we need to lock around all terminal writes. I think all the writes happen in app/Main.hs, so perhaps we can still do this reasonably cheaply.

We have this: https://github.com/tweag/ormolu/blob/7a334ac3a05f3c3bf9b4516b31b7ad4e74e5aacf/src/Ormolu/Fixity/Internal.hs#L275-L276 But it is only emitted when --debug is set.

I think we don't need to change sth here for correctness, but there actually might be some redundant work when multiple threads try to compute the value for the same cache key?

We could also just make this thread-safe by using an MVar instead of an IORef, right?

Yes :+1:

michaelpj commented 2 months ago

Maybe we could only do parallel processing for --mode inplace? 🤔

I think it would be nice if it worked for --mode check too. TBH, I think --mode stdin is a bit questionable with multiple files even without parallelism... do people use that?

amesgen commented 2 months ago

Maybe we could only do parallel processing for --mode inplace? 🤔

I think it would be nice if it worked for --mode check too.

Oh yeah, --mode check should also work with parallelism, definitely :+1:

TBH, I think --mode stdin is a bit questionable with multiple files even without parallelism... do people use that?

Yeah, agreed, I can't imagine a use case.

mrkkrp commented 2 months ago

Any objections/shall I create a PR switching to that?

Indeed, it would be nice to have a formatter for .cabal files that preserves comments.

michaelpj commented 1 month ago

I've continued to allow formatting multiple files with mode stdin, it's just going to be undefined what order you get the output in. I guess in principle if people were using this they might care about the order, but perhaps we can just wait and see if anyone complains? Fixing it would be more work for probably no users.

michaelpj commented 1 month ago

Also this should probably be documented, not sure where.

mrkkrp commented 1 month ago
  • which is a breaking change to the Utils module

I think we only consider the Ormolu module as stable, so changes in Ormolu.Utils.IO should be fine.

I've continued to allow formatting multiple files with mode stdin, it's just going to be undefined what order you get the output in. I guess in principle if people were using this they might care about the order, but perhaps we can just wait and see if anyone complains? Fixing it would be more work for probably no users.

I think it is totally fine. If anyone uses formatting through stdin on multiple files they can complain in the issue tracker. Not sure we even need to document that.

mrkkrp commented 1 month ago

@amesgen I think this is ready to be merged. What do you think?

mrkkrp commented 1 month ago

Thanks!