ndmitchell / hlint

Haskell source code suggestions
Other
1.47k stars 196 forks source link

HLint slow on larger files #983

Open phadej opened 4 years ago

phadej commented 4 years ago
% wc cabal-install/Distribution/Client/Setup.hs   
  2709  12689 112567 cabal-install/Distribution/Client/Setup.hs

With hlint-3.1:

% time hlint cabal-install/Distribution/Client/Setup.hs
hlint cabal-install/Distribution/Client/Setup.hs  3,64s user 0,09s system 99% cpu 3,741 total

Where hlint-2.0.15:

hlint cabal-install/Distribution/Client/Setup.hs  1,19s user 0,06s system 98% cpu 1,261 total

i.e. I see 3x regression in performance.

shayne-fletcher commented 4 years ago

Hi @phadej ! Can you please advise on your GHC version? Is it 8.10.1 by any chance?

shayne-fletcher commented 4 years ago

In the event it is, would you mind reporting your benchmark with an HLint built this way?


cabal new-build  --constraint "hlint +ghc-lib" --constraint "ghc-lib-parser-ex -auto" --constraint "ghc-lib-parser-ex -no-ghc-lib"

I don't expect it to make a difference but still would be good to be sure.

phadej commented 4 years ago

I’m using GHC-8.8.3 (HLint-2.0 was compiled with GHC-8.6.5)

On 8. May 2020, at 15.39, Shayne Fletcher notifications@github.com wrote:

 In the event it is, would you mind reporting your benchmark with an HLint built this way?

cabal new-build --constraint "hlint +ghc-lib" --constraint "ghc-lib-parser-ex -auto" --constraint "ghc-lib-parser-ex -no-ghc-lib" — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

ndmitchell commented 4 years ago

Thanks for the report and the test case. Profiling shows we burn about 12% lexing, 68% of the user-defined hints, 18% on the built-in hints. Zooming into the user-defined hints, 50% (of total runtime) goes into unifyExpr, of which 26% goes into scope matching. I've found and fixed a few super-low-hanging fruit (~0.3s), but it's mostly shuffling the time around a bit. What we really need to do is:

Naturally I'm keen to do the first step first, since optimising the wrong code can often make it harder to fix the right code. In the short term I don't think there will be many improvements though.

phadej commented 4 years ago

Testing 3.2 vs 2.1.17, Duplicate is slow.

3.2

Timing Initialise
  0.00s global flags
  0.00s TOTAL

Timing Parse
  0.08s ProjectPlanning.hs
  0.08s TOTAL

Timing Config
  0.09s data/hlint.yaml
  0.09s TOTAL

Timing Hint
  2.98s Duplicate
  1.42s Match apply
  0.26s Extensions
  0.12s List
  0.07s Bracket
  0.06s Pattern
  0.02s Lambda
  0.01s ListRec
  0.01s Monad
  0.00s Other items (7)
  4.96s TOTAL

Timing TOTAL
  4.96s Hint
  0.09s Config
  0.08s Parse
  0.00s Initialise
  5.13s TOTAL

Took 6.11s

2.1.17

Timing Config
  0.04s /cabal/store/ghc-8.6.5/hlint-2.1.17-ae0b98d92ed036e15c5c3d8506e30faa79e6751fbdf5b13c105857a1806c08f5/share/hlint.yaml
  0.04s TOTAL

Timing Parse
  0.13s ProjectPlanning.hs
  0.13s TOTAL

Timing Hint
  1.60s Match apply
  0.09s Extensions
  0.08s Bracket
  0.07s Import
  0.04s Pattern
  0.03s List
  0.01s Duplicate
  0.01s Lambda
  0.01s ListRec
  0.01s Other items (7)
  1.94s TOTAL

Timing TOTAL
  1.94s Hint
  0.13s Parse
  0.04s Config
  2.10s TOTAL

Took 2.12s
ndmitchell commented 4 years ago

Duplicate is gone as of #1150 and v3.2.1 so that shouldn't be an issue anymore.

phadej commented 4 years ago

Setup.hs as in the original issue

HLint 3.2.1

Timing Initialise
  0.00s global flags
  0.00s TOTAL

Timing Parse
  0.08s ./cabal-install/src/Distribution/Client/Setup.hs
  0.08s TOTAL

Timing Config
  0.10s data/hlint.yaml
  0.10s TOTAL

Timing Hint
  1.47s Match apply
  0.30s Extensions
  0.11s Pattern
  0.11s Bracket
  0.10s Duplicate
  0.08s List
  0.03s Lambda
  0.01s ListRec
  0.01s Monad
  0.01s Other items (7)
  2.23s TOTAL

Timing TOTAL
  2.23s Hint
  0.10s Config
  0.08s Parse
  0.00s Initialise
  2.41s TOTAL

Took 3.05s

HLint 2.1.17

Timing Config
  0.04s /cabal/store/ghc-8.6.5/hlint-2.1.17-ae0b98d92ed036e15c5c3d8506e30faa79e6751fbdf5b13c105857a1806c08f5/share/hlint.yaml
  0.04s TOTAL

Timing Parse
  0.09s ./cabal-install/src/Distribution/Client/Setup.hs
  0.09s TOTAL

Timing Hint
  0.78s Match apply
  0.07s Extensions
  0.07s Import
  0.05s Pattern
  0.03s Bracket
  0.02s List
  0.01s Duplicate
  0.01s Lambda
  0.00s ListRec
  0.01s Other items (7)
  1.04s TOTAL

Timing TOTAL
  1.04s Hint
  0.09s Parse
  0.04s Config
  1.17s TOTAL

Took 1.21s

Interactive use of HLint is still not an option :(

ndmitchell commented 4 years ago

The two that are most impacted are the ones where Uniplate optimisations make the biggest differences. It's possible we've fallen off the Uniplate optimisations, which would explain the divergence. Worth checking. #712 might be the thing to investigate - turning on the Uniplate speed checker.

ndmitchell commented 3 years ago

See https://github.com/ndmitchell/hlint/issues/1151#issuecomment-721886749 for some additional performance measurements. Match again seems an order of magnitude greater than the rest.

mbj commented 2 years ago

@ndmitchell

Precompile the expression matching into a lookup machine rather than a linear traversal over all patterns one-by-one.

So per "Flag matcher" the AST is traversed once right now?

ndmitchell commented 2 years ago

@mbj per AST node, all hints are traversed once is probably a more accurate way of saying it

phadej commented 4 months ago

FWIW, this is till the case, and working on large files is still unacceptably slow:

aeson master % time hlint src/Data/Aeson/Types/ToJSON.hs
No hints
hlint src/Data/Aeson/Types/ToJSON.hs  6,25s user 0,21s system 99% cpu 6,474 total

aeson master % hlint --version
HLint v3.8, (C) Neil Mitchell 2006-2024

It takes less time to compile that file with optimisations.

phadej commented 4 months ago

In fact, it seems hlint got slower (I think I have the same machine still as when I opened this issue):

% time hlint cabal-install/src/Distribution/Client/Setup.hs
No hints
hlint cabal-install/src/Distribution/Client/Setup.hs  5,56s user 0,10s system 100% cpu 5,655 total

(to be fair, that file has grown a bit).