haskell / happy

The Happy parser generator for Haskell
Other
276 stars 84 forks source link

GHC takes a very long time to compile parser generated by Happy 1.19.8 compared to Happy 1.19.5 #109

Closed athas closed 2 days ago

athas commented 6 years ago

I maintain a compiler that uses a reasonably sized parser written with Happy: https://github.com/diku-dk/futhark/blob/e3f5344ac74f80a7d74ee57cbff27597b86eb545/src/Language/Futhark/Parser/Parser.y

There is nothing fancy going on, and I believe that I have type signatures for all productions. I do use parameterized productions a little bit, though. Unfortunately, while the parser generated by Happy 1.19.5 compiles with GHC in a dozen seconds, the parser generated by Happy 1.19.8 takes upward of twenty minutes. I have identified these hole-y type signatures as the culprit:

#if __GLASGOW_HASKELL__ >= 710
happyReduce_6 :: () => Happy_GHC_Exts.Int# -> L Token -> Happy_GHC_Exts.Int# -> Happy_IntList -> HappyStk (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) -> ParserMonad (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
#endif

If I remove these type signatures, then compile times drop down to a dozen seconds again. Perhaps GHC has some quadratic behaviour in the implementation of typed holes, or perhaps Happy shouldn't be generating these - I'm not sure. But given the relative simplicity of my grammar, I don't think I'll be the only one encountering this. Is there perhaps a workaround, or am I doing something stupid?

(Happy 1.19.6 and 1.19.7 produce code that does not compile for this parser, likely due to #94.)

athas commented 6 years ago

Significantly reducing (but not totally eliminating) the use of parameterized productions seems to function as a workaround.

harpocrates commented 6 years ago

Unless most of GHC's work here is parsing (something I very much doubt, given that we are talking about a huge slowdown), this sounds like GHC could be at fault here - we are giving it strictly more information about the types, yet it is taking much longer.

athas commented 6 years ago

Yes, I think you are correct. Probably the implementation of typed holes was not written to cope with large quantities of holes.

dpwiz commented 6 years ago

I had this bug too, but then it disappeared mysteriously. Now it is back.

My generated Parser.hs contains 92 functions with 92 arguments and 8.2.1 takes forever to compile. I left if to run for the weekends and it still wasn't done.

UPD: with 8.2.2 it dropped to a dozen-minutes range. Still unacceptable if you want to hack on the parser itself.

simonmar commented 6 years ago

Does anyone have a standalone repro for this bug please?

dpwiz commented 6 years ago

Not exactly minimal and does not resemble the real parser, but has almost same number of type vars used and, I hope, shows the slowdown.

https://github.com/wiz/too-happy

dpwiz commented 6 years ago

I did a round of too-happy builds with different GHC versions.

7.10 takes 20 seconds to build. 8.0, 8.2.1 and 8.2.2 all take 3.5 minutes.

harpocrates commented 6 years ago

@wiz That sounds a look like a GHC regression with respect to partial type signatures. Do you want to report it to the GHC Trac?

dpwiz commented 6 years ago

That makes sense. But, given the time scale of GHC fixes and releases, we wouldn't see the results of the fix for quite a time. If package could provide a workaround for GHC 8.{0,2} this would save us a lot of compilation time until a 8.4-enabled LTS snapshot comes out.

harpocrates commented 6 years ago

@wiz I completely agree. It is going to be difficult to do that without breaking backwards compatibility though (perhaps that's OK) - I spent a long time trying to figure out how to fix https://github.com/simonmar/happy/issues/94 and PartialTypeSignatures were the only way I found to not regress in features (here is the commit: https://github.com/harpocrates/happy/commit/b79dbdbcd1530ffd0916317e08097b4a7f5d0fd0).

I see a couple avenues forward, but I will not be implementing any of them:

Also, if you would like to undertake the last of these (my favorite), getting a fix into GHC 8.4.1 will be tough (the branch has been cut and the release is due in only a couple of weeks).

Note that I just tried this on my working GHC HEAD and performance is even worse... 😞

harpocrates commented 6 years ago

Actually, perhaps there is a mediocre (hopefully temporary) workaround: do not generate signatures when there is an empty monad context (basically, pay the price of partial type signatures only when you want the features of https://github.com/simonmar/happy/pull/49).

harpocrates commented 6 years ago

Reported: https://ghc.haskell.org/trac/ghc/ticket/14683#ticket

harpocrates commented 6 years ago

While producing a test case, I realized this has nothing to do with Happy 1.19.5 or 1.19.8 or partial type signatures - it is just a straight GHC compile time regression. After inlining the data definition of Token from too-happy, here are the compile times of Grammer.hs:

GHC 7.10.3 GHC 8.0.2 GHC 8.2.2 GHC HEAD
Happy 1.19.5 17.37s 127.89s 115.26s 125.44s
Happy 1.19.8 17.71s 129.23s 114.42s 128.42s

So: please ignore everything I've said so far in this thread and refer to the GHC thread (https://ghc.haskell.org/trac/ghc/ticket/14683#ticket).

dpwiz commented 6 years ago

I've noticed some #ifs in generated code containing types with holes and added a hack to redefine __GLASGOW_HASKELL__ to 709 in the parser file to remove those and the slowdown got reduced massively.

{
...
#undef __GLASGOW_HASKELL__
#define __GLASGOW_HASKELL__ 709
}
#if __GLASGOW_HASKELL__ >= 710
happyReduce_1 :: () => Happy_GHC_Exts.Int# -> T.Token -> Happy_GHC_Exts.Int# -> Happy_IntList -> HappyStk (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) -> P (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
#endif

However I could not reproduce this hack with the too-happy example because there was no code generated inside #ifs :confused:

#if __GLASGOW_HASKELL__ >= 710
#endif

Snapshot (lts-10.3) and happy(1.19.8) versions used were the same.

harpocrates commented 6 years ago

@wiz Do you have a commit of your project you could point me to where you are seeing this longer compilation times? I've tried compiling master (of Futhark) with Happy 1.19.8 and a recent GHC and compile times seemed fine...

dpwiz commented 6 years ago

@harpocrates The code haven't changed. We were forced to move to 8.x series and after patching some imports everything but parser compilation time was fine.

nomeata commented 6 years ago

If these slow-downs are really related to the type holes in certain versions of GHC, then maybe happy should not add them unless it is a version of GHC known to not have that regression?

harpocrates commented 6 years ago

@nomeata Do you have an example of a slowdown caused by type-holes? If so, I would like to distill it to a test case for a GHC bug. I haven't been able to replicate these slowdowns.

nomeata commented 6 years ago

Yes; see the commit https://github.com/antalsz/hs-to-coq/commit/c538e6a92362d22cf6abf45c8f2746ea938ac35a. Without this hack, GHC-8.0.2 and latest happy, GHC sits there staring at Parsers.hs; with this hack compilation is fast.

harpocrates commented 6 years ago

Thanks @nomeata, I can reproduce now! I've opened https://ghc.haskell.org/trac/ghc/ticket/14766 for this. It looks partial type signatures are an issue. Copying over the table of compile-times from the ticket, I observed for Happy-1.19.8:

GHC 8.0.2 GHC 8.2.2 GHC HEAD
Signatures disabled using CPP hack 24.13s 22.93s 34.05s
With signatures >15m ~13m >15m
andrew-wja commented 4 years ago

Hi all,

Have we seen any movement on this issue? I noticed that https://gitlab.haskell.org/ghc/ghc/issues/14683 has been closed, but I am still having the issue with GHC 8.6.5

I've been working on a standalone parser for the MLIR syntax (https://gitlab.com/arc-compiler/arc2-mlir/blob/master/src/ARC/MLIR/Parser.y) which is now taking upwards of 30 minutes to compile with GHC 8.6.5, making iterating on the parser very painful and time consuming.

andrew-wja commented 4 years ago

Updating this for posterity (since this issue is high on the list of Google results for GHC slowdown compiling Happy-generated code).

After applying the CPP hack I was still getting the same ludicrous compile time with GHC 8.6.5. I decided to just throw up my hands and try the most recent GHC I could find in the hope that the performance issues were fixed there. I'm happy to report that with GHC 8.8.1, the 30 minute compile time for the Happy parser I linked above is reduced to about ten seconds! So GHC 8.8.1 (specifically, the build in the stack nightly resolver right now) does not exhibit this slowdown compiling Happy-generated parsers.

andreasabel commented 3 years ago

Please update the title of this issue to reflect the results of the discussion:

sgraf812 commented 2 days ago

Closing this issue; it is likely just an issue within GHC that long has been resolved. Besides, this issue has been dead for nearly 5 years. If it becomes pressing again, feel free to reopen.