Closed knothed closed 2 years ago
@knothed Still trying to find the time to properly sit down and review this, but I wanted to reaffirm I'm very happy to see this PR!
@Ericson2314 perfect, I‘m quite happily looking forward to hearing your opinion!
I didn't review much code, because I trust you on not having modified the core happy logic. Plus, the tests seem to pass, so your refactoring seems to be successful.
I quite like the new package organisation, but ultimately it's up to the maintainers to decide.
Edit: Although, I wonder: Did CI really run the tests? I can see no output difference between https://travis-ci.org/github/simonmar/happy/jobs/774274988 (where they should run) and https://travis-ci.org/github/simonmar/happy/jobs/774274987 (where they shouldn't run).
For CI maybe try GitHub workflows. See Alex, which has made the switch, though I think is prefer a manually written one like I attempted in https://github.com/simonmar/alex/pull/190 as last I checked haskell-ci didn't support macOS with GitHub workflows yet.
Ah just merge in https://github.com/simonmar/happy/pull/196 and get it working with what you got. Hopefully we can merge that one soon and fix CI so this can be safe to merge.
Edit: Although, I wonder: Did CI really run the tests? I can see no output difference between https://travis-ci.org/github/simonmar/happy/jobs/774274988 (where they should run) and https://travis-ci.org/github/simonmar/happy/jobs/774274987 (where they shouldn't run).
Yes. In the second link, the tests have been run normally, while in the first link, make sdist-test
was performed. Compare e.g. the number of lines from the output.
Ah just merge in #196 and get it working with what you got
Hmm... Isn't GHA-based CI an entirely different issue? As I see it, Travis CI seems to work with this PR: https://travis-ci.org/github/simonmar/happy/builds/774274977. And it indeed runs the tests.
@knothed Oh when I clicked "all checks have passed" Travis was not there, so I assumed it was broken. Nevermind, then.
OK I would like this to get merged soon :). It is very good and has sat open too long.
A few more things
happy
itself in packages/
too? This will make filtering the sources per package from the repo easier, because every package gets its own subtree.isBootstrapped
to leak out whether to do the attribute grammar tests.If you check the "maintainers can push to PR" box I can help fix conflicts.
OK I would like this to get merged soon :). It is very good and has sat open too long.
Nice, I'd also like that :)
- Can we put
happy
itself inpackages/
too? This will make filtering the sources per package from the repo easier, because every package gets its own subtree.
Yes we could do that. This would however require all of happy
s data-files to be inside this subdirectory. These include the readme, changelog, makefile etc., which would be much better placed in the top directory instead of being inside packages/happy
.
We could probably solve this using links? But I don't know whether that is cross-system-proof.
- I would be open to giving the swappable packages names based on the algorithm they currently use. e.g. maybe we would switch the default backend from this one to a different one at some point :).
Sounds good - what would you propose?
- Can the attribute grammar stuff leverage this some how? I see you have
isBootstrapped
to leak out whether to do the attribute grammar tests.
Bootstrapping is a frontend-specific thing, and the frontend exposes isBootstrapped
. The other packages don't care about bootstrapping.
Using happy
with attribute grammars requires the bootstrapped happy
however which is why the attribute grammar tests are listed separately inside Tests.hs
.
We could probably solve this using links? But I don't know whether that is cross-system-proof.
I'm told windows support for symlinks is continuously improving. And I think git had some tricks even before that, so that's fine with me.
- I would be open to giving the swappable packages names based on the algorithm they currently use. e.g. maybe we would switch the default backend from this one to a different one at some point :).
Sounds good - what would you propose?
I guess first I would need to understand how/why your happy rad still uses regular happy-backend
.
Well isBootstrapped
should be called something like supportsAttributeGrammars
for starters. The former is an implementation detail, the latter isn't.
I guess first I would need to understand how/why your happy rad still uses regular
happy-backend
.
My rad-backend
is independent of happy-backend
. The executable happy-rad
however does include regular happy-backend
so the user can decide which backend they wanna use.
Should we remove the TODO file? It seems outdated.
I guess first I would need to understand how/why your happy rad still uses regular
happy-backend
.My
rad-backend
is independent ofhappy-backend
. The executablehappy-rad
however does include regularhappy-backend
so the user can decide which backend they wanna use.
Oh! So it could be skipped to force them into the rad backend? That's great news.
Could you make a happy-backend-glr
? And make it optional just like you made happy-rad
support two backends?
Perhaps that is a better splitting than whatever I was thinking of with the attribute grammars :).
Yes, I‘ve also thought about splitting the GLR backend off. I‘ll do that.
It could however be that the GLR backend has a dependency to the normal backend, as it uses, I think, some of the normal backend logic. But that shouldn’t be a problem – the normal backend (obviously) doesn’t depend on the GLR backend.
Oh! So it could be skipped to force them into the rad backend? That's great news.
Exactly!
https://github.com/simonmar/happy/blob/1ad42c11b24ca6bc67720489a21fc7e0aa23dcfe/packages/backend/src/Happy/Backend/ProduceGLRCode.lhs actually looks like it just uses core!
It seems the attribute grammar does have a backend component, after all, or at least there is some ugly string concatenation in the frontend mangler that look like it.
But this is no fault of the current PR, so I'm happy to deal with it post merge :)
cabal check
wants license and maintainer fields for each package. I've just copied most fields from the original happy package, if that's okay.
I've also added some imperfect descriptions to the packages, maybe you could take a quick look at them.
GitHub actions works now (https://github.com/piknotech/happy/actions). I've manually updated haskell-ci.yml
and haskell-ci.patch
as haskell-ci
doesn't seem work on Mac with GitHub workflows, so someone should probably test the haskell-ci.yml
file generation again.
Also, could you approve the GitHub actions to run here in this repo? As a first-time contributor I need a maintainer to approve running workflows.
cabal check
wants license and maintainer fields for each package. I've just copied most fields from the original happy package, if that's okay.
Maybe this would be an opportunity to name the new maintainers, as currently only Simon Marlow is present in the cabal files.
@Ericson2314 The GLR-backend is now split off into a separate package. It is however (and has previously not been) covered by any tests.
Are you happy with the cabal files?
happy-rad
now even features three backends at once: normal, glr and rad. It calls the correct one depending on whether the --glr
or --rad
flags are set.
This multi-package option parsing can be seen here. Could be helpful for future package developers.
I love the idea, but this is impossible to review. Can you split it into several parts?
You could first factor out happy-core
into its own package as a separate PR, then happy-middleend
, then happy-backend
, etc, each in its own incremental PR.
Also, I disagree a little with the way you have split this into packages, in particular regarding happy-core
and happy-middleend
. Command-line options are a part of the command-line interface and definitely do not belong to happy-core
, whereas routines to transform Grammar
into (ActionTable, GotoTable, [Lr1State], [Int])
can be easily put into the same package as Grammar
, ActionTable
, and GotoTable
.
So I think that:
happy-cli
instead of happy-core
happy-middleend
should be merged into happy-core
, and the term “middleend” should be avoided altogether (I’d rename runMiddleend
to buildTables
or something)@int-index I do not like the principle of putting CLI stuff everywhere either, but it does seem awfully convenient for whipping up exes like @knothed's happy-rad
it does seem awfully convenient for whipping up exes like @knothed's happy-rad
True, hence the suggestion to put CLI utilities into their own package happy-cli
it does seem awfully convenient for whipping up exes like @knothed's happy-rad
True, hence the suggestion to put CLI utilities into their own package
happy-cli
So it's OK if happy-frontend and happy-backend depend on those?
So it's OK if happy-frontend and happy-backend depend on those?
No. What is there in packages/core/src/Happy/Core/OptionParsing.hs
(in the current patch) that the frontend or the backend would use?
Oh, I see packages/*/src/Happy/*/CLI.hs
doesn't actually use that stuff.
edit Are you OK with that stuff being there since it is still morally CLI stuff even if it doesn't use that module?
Anyways, I like the happy-cli
idea, it doesn't resolve the tensions but it is an improvement.
I would like to keep middleend
separate, however. It is a bad name, but I do think it is useful to separating a yet-to-be-named middle stage of compilation from the stuff everything needs.
Ultimately, I don't think happy is written with modularity in mind, so it's better to do some split, erring on the side of more granularity, and then refactor things to be sane after. (I do not want to do a release right after this PR is merged.)
Are you OK with that stuff being there since it is still morally CLI stuff even if it doesn't use that module?
Are you referring to packages/*/src/Happy/*/CLI.hs
? No, I don’t think it belongs there either. But it’s more tolerable, because there’s no immediate use case that would suggest doing otherwise.
I plan to use happy-core
as part of my TTH-based EDSL, and it definitely has nothing to do with CLI. On the other hand, something like happy-backend
would only be used without CLI if we planned to support a Happy GUI or something, which is less likely. (But if it happens – we will need to think about moving CLI stuff out).
I do think it is useful to separating a yet-to-be-named middle stage of compilation from the stuff everything needs.
Okay, but let’s think about concrete use cases. What library or app would ever require Grammar
and ActionTable
/ GotoTable
but not the function to transform from one to another?
We're on same page for the CLI stuff now.
For the fate of middleend, I thought one could convert a Grammar to the tables and then dispense with the grammar, but the imports seem to indicate otherwise. That pulls the foundation out from underneath that the middle stuff is a pipeline stage at all.
(To recap, the structure I generally like would be ASTs/IRs 0..n+1, and transformations 0..n, depending online on the input and output ASTs/IRs.)
FWIW, initially we thought about a split of just frontend > middleend > backend and middleend was all about turning Grammar descriptions into ActionTables. But then we discovered that all of them need CLI functionality and share types. It would've been awkward to put them all in frontend, so we wrote core to put the shared stuff there.
Now it might seem that middleend doesn't do much and we could move it into core, but then frontend and middleend implicitly depend on how we translate Grammars to ActionTables... Note there might be other ways to write a buildTables
function. Maybe that is OK?
CLI stuff ended up in core because we didn't want to introduce yet another package, but happy-cli is ultimately the better solution.
OK it seems all the backends really do use the original Grammar
meaningfully.
If that is something that is reasonable to fix (by instead passing in the residual information that is not duplicate with that in tables) I would have happy-grammar
and happy-tabular
together with happy-cli
replace happy-core
.
The final breakdown would be:
Tables
-> happy-tabular
GenUtils
-> happy-grammar
, but should be liquidated
Grammar
-> happy-grammar
NameSet
-> inline to happy-middleend
, redefine in happy-rad
OptionParsing
-> happy-cli
Backends would depend on happy-grammar
and happy-tabular
until that situation is rectified.
If, on the other hand, it is not reasonable to keep Grammar
away from the backends, then I would further combine happy-middleend
into happy-tabular
. (The backends would morally depend on how the tables were projected, but the fronted wouldn't depend on the tables.) That is closer to what @int-index said, except the happy-grammar
is still separate for the frontend to depend on without happy-tabular
.
How does that sound to everyone? Take your time responding, I need to go to bed :).
I'm fine with the division @int-index and @Ericson2314 proposed - splitting happy-core
seems sensible.
OK it seems all the backends really do use the original Grammar meaningfully.
They do, and I don't think it's sensible to try changing that by introducing another IR type that the backends should then use instead of Grammar
. There is lots of information inside Grammar
that is used by code-gen that is not represented in the tables, and all of it is required for a correct code-gen. The only fields in Grammar
which are not backend-relevant are expect
and priorities
.
This would mean that happy-middleend
would be merged into happy-tabular
, which is okay. Users who want to write additional optimisation passes based on the generated tables could still do so, they don't need happy-middleend
for that.
All of happy-middleends
CLI-options would just be merged into the frontend, and the frontend would perform the IO-tasks that the middleend did so far: info file generation and debug dumping.
What would you suggest to improve the situation with the CLI? I do think that it is sensible for each package to expose its own CLI-options. When the CLI-options are not per-package, but central (i.e. defined by the executable itself), we lose the possibility to easily combine packages without redefining the CLI handling every single time.
A possibility would be to split happy-package
into itself plus happy-package-cli
, but that immediately leads to package flooding. I believe it's best as it currently is.
Note that a package can always be used without using its CLI options, so a GUI could e.g. just call the backend by creating BackendArgs
, without CLI interplay.
Also, I'll be on vacation for the next 4 weeks from now on, so all of you feel free to perform some patches to this PR yourself, given enough spare time and/or dedication. I'll still answer on comments, I just can't write any code. 😄
All of happy-middleends CLI-options would just be merged into the frontend, and the frontend would perform the IO-tasks that the middleend did so far: info file generation and debug dumping.
I wouldn't do that, because then happy-frontend gains the happy-tabular dep. I would just leave those in happy-tabular for now, and we can move them out later.
What would you suggest to improve the situation with the CLI?
Fine as is. The individual packages don't use that CLI module that was in core, so we side-step it for now. When we do the TH edsl we can consider doing the -cli package explosion as @int-index implied (as I interpreted, at least).
Not a productive comment, but I just wanted to say that I'm really happy to see this, as I was looking for a happy parser that I could use to partially automate a labor-intensive conversion from GHC's Haskell grammar. This looks like just what I need. I'm quite eager to see it merged.
@cdsmith On the contrary it's great motivation to here this stuff doesn't just feel like a good idea, but has concrete immanent use-cases something laudable like code world (I presume :))!
Welcome back, @knothed :). @int-index what do you think of my last proposal? I was actually expecting us to decide on something first before @knothed implemented it — hopefully it's more too many more hoops to jump through!
Yeah, it’s fine by me.
But as I said, this current PR changes too much at once, I find it difficult to review. Could we do it gradually? For example, start by separating out happy-grammar
into its own package.
Perfect, I’ll try splitting up the MR as sensibly as possible. :)
Would you like each MR to actually build and work for itself? That would be a bit challenging considering the changes to the parsing architecture.
Would you like each MR to actually build and work for itself?
Yes.
That would be a bit challenging considering the changes to the parsing architecture.
All the more reason to do things separately. Consider starting with changes to the parsing architecture before moving things around -- that would make it easier to review the new architecture as a proper diff
I suppose the first PR can establish the subdirectory regime (the moves will not clutter the diff), and then the subsequent ones are more simple.
Hmm, wouldn’t moving the source files into different directories break the single-package happy
, which expects them in src/
?
I just mean the first PR can adopt the directory structure of this PR, but with just the two packages happy and happy-grammar.
Following https://github.com/simonmar/happy/issues/167#issuecomment-780591344 and #187, this PR aims to split up
happy
into several components. Two main goals are accomplished:happy
like frontend, middleend and backend.happy
or to create and distribute new, possibly experimental,happy
-features and -components (like a TemplateHaskell-frontend or a recursive ascent-descent-backend).Ideally,
happy
programmers should be able to combinehappy
packages almost arbitrarily: Each frontend should, in theory, be able to work with each middleend, which in turn should work with each backend.To achieve this goal, we (@sgraf812 and me) have decided to split
happy
into six packages – five libraries and one executable – as follows:happy-frontend
is responsible for reading .y-files and parsing them into theGrammar
datatype.happy-middleend
transforms theGrammar
into anActionTable
and aGotoTable
.happy-backend
takes the these tables and generates template-based Haskell code.happy-core
defines the interfaces (IRs) between the frontend, middleend and backend packages. Also, it contains core utilities like option parsing.happy-test
hosts the test files and contains logic to execute the test suite.happy
itself just puts the above packages together and contains little additional logic.Why six packages?
We want users to be able to combine different frontend-, middleend- and backend-packages together. Therefore,
happy-core
is vital as it defines data types that every such package should be able to work with; in particularGrammar
,ActionTable
andGotoTable
.We'll look at
happy-test
shortly.How does a package look like?
The three packages
happy-frontend
,happy-middleend
andhappy-backend
are all built similarly: they have a publicrunFrontend
/runMiddleend
/runBackend
function which performs the package's main task. Also, they have a common interface for providing and parsing command line arguments.Let's take a look at
happy–backend
(packages/backend). It has the following exposed components:A main entry point function,
runBackend
, which performs the code-gen:BackendArgs
are user-provided arguments, whileGrammar
,ActionTable
andGotoTable
come from the frontend and middleend (as results ofrunFrontend
andrunMiddleend
).happy
could callrunBackend
with arbitraryBackendArgs
. There is no requirement for a command line interface per se.But ultimately, we do want to perform
happy
via a CLI, meaning we want to createBackendArgs
from parsing the command line arguments. Just as previously inhappy
, we define the CLI arguments using GetOpt, but on package-level:So, each package defines its own CLI
Flags
andoptions
and, in addition, aparseFlags
function which converts these CLI flags intoPackageArgs
.Then, option parsing takes place as follows:
happy
executable sticks theoptions
from each package together and callsgetOpt
.Flag
s are then used to callparseFlags
and execute each package's respective main entry point function:runFrontend
,runMiddleend
andrunBackend
.In this way, option parsing is fully modularized such that new packages can define and use their own CLI options.
We just have to pay attention to one thing: with getOpt, it is not possible to have multiple options with the same option character or option string. This means, if there are two different packages which both define an
-o
option, these cannot be used together in the samehappy
executable. Package authors have to watch out for such option collisions.Example: happy-rad
As an example for how this modularization can be used to create a new executable, let's take a look at happy-rad.
happy-rad
extendshappy
by a recursive ascent-descent (RAD) backend. It contains two packages:rad-backend
: a backend which produces RAD code. This package is completely independent ofhappy-backend
. Still, it works with the same data types (Grammar
,ActionTable
,GotoTable
) ashappy-backend
.happy-rad
: it depends (a.o.) onhappy-frontend
,happy-middleend
,happy-backend
(which are remote packages) andrad-backend
(local package). First, frontend and middleend are executed normally. Then, depending on whether the--rad
option is set, eitherrad-backend
orhappy-backend
is invoked.As
happy-frontend
,happy-middleend
andhappy-backend
are (will be) remote packages, we can just reuse their functionality. In addition, we get testing for free:testing happy and happy-rad
How does the
happy-test
package help package authors in testing their packages?We've moved all of the existing 26 test grammars (e.g. ParGF.y) into
happy-test
.happy-test
exposes the following function:In their test suite, the package author can call
test
and customize the testing procedure: they can specify which of the 26 grammars should or should not be tested and which argument combinations their executable should be tested with. In addition, they can specify further, custom, testing grammars.For example,
happy-rad
could provide[--rad]
for the set of arguments it should be tested with, while normalhappy
could provide[-a, -ag, -cg, -acg]
.happy-test
can also do something different: The functionality ofmake sdist-test
, which was previously in the Makefile, has now moved intohappy-test
. This way, every extension author can test their sdists. What doesmake sdist-test
do?cabal sdist all
to create an sdist of all local packagesMore
For the end user of
happy
, not much changes: the CLI stays identical, the functionality remains the same. Just the compile time ofhappy
increases a bit.Template Files
As the template files are backend-specific, these are now data-files of
happy-backend
, not ofhappy
itself. Just like the tests, which are data-files ofhappy-test
instead ofhappy
itself.Bootstrapping
Bootstrapping is a process which is purely frontend-specific. Therefore, the bootstrap flag only exists in
happy-frontend
, but still exists. A next step could be to outsource the bootstrapping parser into its own package, as described in https://github.com/simonmar/happy/issues/187#issuecomment-825564616.CI
Travis and appveyor used
make sdist; make sdist-test-only
. We have replaced this with our newmake sdist-test
.make sdist-test
currently doesn't work on MinGW.