haskell / happy

The Happy parser generator for Haskell
Other
276 stars 84 forks source link

Consider a recusive ascent/descent backend #167

Open sgraf812 opened 4 years ago

sgraf812 commented 4 years ago

See https://github.com/djspiewak/scala-bison, its tool paper and the original paper on recursive ascent/descent.

There's also the Typed LR paper, which maybe is a preferrable embedding compared to plain recursive ascent-descent.

I expect that with a bit of tuning, such a parser can outperform happy's current table-driven backend because GHC will be able to inline functions.

sgraf812 commented 4 years ago

FWIW, I now have a student tackling this for his bachelor's thesis. So eventually there will be at least one PR.

sgraf812 commented 3 years ago

(And half a year later, there's #174 🎉 )

andreasabel commented 3 years ago

@sgraf812 @simonmar Basic question: why should this go into happy? Can't you make a separate parser generator that is compatible with happy, i.e., reads .y files?

Ideally, of course, it is great to have added functionality.

From a pragmatical side, there are down sides:

Note that we already have a broken backend in happy: --glr with its original contributors vanishing after completing their thesis work.

I think these aspects should be taken into account when adding functionality to happy, in particular when it is substantial.

Sorry for playing the devil's advocate here.

My reservation comes from experiences with maintaining the BNFC front-end to parser generators. This tool has suffered from feature bloat from eager contributors who then moved on to other projects, already after 5 years not feeling responsible to maintain their components. In the end, I had to weed out these components. That is undesirable, because you never know whether some user out there depends on these features.

I think all of happy should be in good shape before we accept new components. And new components should come with a signed maintenance pledge for at least 5 years.

sgraf812 commented 3 years ago

Those are very good points. I agree, happy is not actually in a shape that accommodates any more extension. But what I like about #174 is that it is a drop-in replacement for happy.

In an ideal world, it would make sense to split happy into three components:

  1. happy-frontend: Parsing y-Files (including GHC-specific stuff), return an AST representing the grammar+entrypoints
  2. happy-middleend: grammar analysis that generates LALR(1) states, producing a (potentially ambiguous) ActionTable and GotoTable
  3. happy-codegen: Different codegen strategies, like table-based LALR and GLR.

I think #149 wants to replace (1) with a TH-based frontend (maybe in the form of "Staged Selective Parser Combinators"), while #174 wants to replace (3) with an RAD-based one. "Staged Selective Parser Combinators" also describes some Grammar optimisation passes that might qualify as an extension to (2). (It's also certainly possible to want to replace (2) with an LL(k)-based middle-end, although it's not a need of mine.)

If we can make the plumbing of the vanilla happy executable as trivial as possible (e.g. by also putting the CLI interface into the library component), those changes would be pretty simple and easily combinable into another drop-in replacement for happy. Arguably, happy would have 3 or even 4 package components instead of just 1 now, but that neither is a breaking change nor should come at a huge cost in compile-time.

Ericson2314 commented 3 years ago

Yes splitting happy along those lines is very much something I'm interested in. As always, Happy can pave the way for the architectural changes GHC itself should make :D.

sgraf812 commented 3 years ago

The question is if the maintainers of happy would be OK with the decomposition into multiple library sub-components, which undoubtly will introduce a bit of a hassle wrt. release management (multiple libraries to bump, etc.).

I'd be open to contributing a patch that does the separation, but I'm rather reluctant to do so if it won't end up being merged as it would quickly bitrot.

I'm vary of back-compat to older GHC/Cabal versions a bit. Worst case there would be two cabal project file (hierarchies) to be maintained: One with the library components and one where it's all one exe (as it is the case now).

Ericson2314 commented 3 years ago

@sgraf812 I am sorry I didn't see this eariier; I would very much be a fan of such a split! I think @int-index would too, but he should confirm. I am not sure what @simonmar thinks, he might want to weigh in too.

I opened https://github.com/simonmar/happy/issues/187 to track these great (in my opinion) refactors separately from the recursive ascent/descent backend question.

exposedcranium commented 1 week ago

Can someone explain what happened to the RA backend PR and the modularisation effort? They both seem to have been closed without progress several years ago...

@sgraf812

sgraf812 commented 4 days ago

The modularisation effort has been merged here: https://github.com/haskell/happy/pull/191

Since then, happy consists of separate packages. There has been an effort to publish a new happy version (2.0) with this new module structure; alas, that release has been delayed for 3 years for unrelated reasons (#215, #238, #255, #274) that meanwhile have been fixed (#277).

The RA backend was rebased wrt. happy master (which includes the new module structure) and can be found here: https://github.com/knothed/happy-rad.

I started preparing a 2.0 release a while back, but higher priorities intervened. Perhaps I can get back to it later this week.

sgraf812 commented 17 hours ago

Here's a collection of open issues that we need to resolve before the release: https://github.com/haskell/happy/milestone/2

Most of them have PRs already, but https://github.com/haskell/happy/issues/288 does not and needs a bit of coordination with / opinion of the other maintainers.