haskell / happy

The Happy parser generator for Haskell

Should we make the `happy-*` packages public library components of `happy`? #288

Closed sgraf812 closed 2 months ago

sgraf812 commented 2 months ago

Cabal 3.0 introduced the library:visibility field. Should we use it to define the single-library-component happy-* packages as "sublibraries"? Depending on such a library is exemplified here: https://github.com/haskell/cabal/issues/9480#issuecomment-1826883582. For us it would lead to names such as happy:tabular. Defining such public sublibraries is exemplified here: https://github.com/BlockScope/plugins-for-blobs/blob/develop/plugins-for-blobs.cabal. Note the visibility: public fields.
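For readers unfamiliar with the feature, here is a minimal sketch of what a public sublibrary could look like. The module name, source directory, and version number are assumptions made for illustration and are not taken from the actual happy repository.

```cabal
cabal-version: 3.0
name:          happy
version:       2.0

-- A named sublibrary. Without `visibility: public` it would be private,
-- i.e. usable only from within this package.
library tabular
  visibility:       public
  -- Hypothetical layout and module name.
  hs-source-dirs:   tabular/src
  exposed-modules:  Happy.Tabular
  build-depends:    base
  default-language: Haskell2010
```

A hypothetical client package would then refer to the component with the `package:sublibrary` syntax:

```cabal
cabal-version: 3.0
name:          my-client
version:       0.1.0.0

library
  exposed-modules:  MyClient
  -- `happy:tabular` names the `tabular` sublibrary of package `happy`.
  build-depends:    base, happy:tabular
  default-language: Haskell2010
```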

Pros:

Cons:

Given that

  1. we already decided to lock into the GHC ecosystem
  2. cabal-install is easily buildable with older GHCs
  3. older happy versions will continue to build with ancient GHCs and accept the same .y-file syntax,

I don't think requiring Cabal 3.0 is much of a drawback. On the other hand I do see the real costs associated with release management if we stick to one executable package and 5 library packages, all of which need to keep their versions in sync etc.

This decision has potential to lock us in for quite a long time, as happy-tabular is a different library name (technically a package name which implicitly denotes the default library) than happy:tabular. I would like to have the go-ahead from @int-index, @Ericson2314 and @andreasabel on this.

int-index commented 2 months ago

How well do stack and nix support visibility: public?

Since you point out the difference between happy-tabular and happy:tabular, how would nixpkgs deal with this? What do we get instead of haskellPackages.happy-tabular?

int-index commented 2 months ago

Here are the only packages I could find on Hackage that use visibility: public in their .cabal files:

Out of these packages, I found only saturn on Stackage:

Its public libraries are saturn:unstable and saturn:spec. I'm not entirely sure how exactly one would work with those.

By the way, could anyone remind me why we decided to do a package split at all? Why not have a single library happy-lib?

sgraf812 commented 2 months ago

Thanks for pointing out the issues wrt. stack. I had not thought about that. It would indeed be problematic if stack could no longer build happy, but that doesn't appear to be the case, right? After all, saturn is a multi-public-component package.

By the way, could anyone remind me why we decided to do a package split at all? Why not have a single library happy-lib?

The reason for the split is that we wanted to be able to publish happy-rad. That would work just as well with a single happy-lib. The reason for having multiple libraries seems to be "Let's decompose as much as possible, so that clients only need to compile what they need". IMO, this sentiment ignores the very real cost of maintaining these separate libraries. Plus, there is no real added benefit because happy does not have a lot of transitive dependencies; the transitive deps of happy-grammar are largely the same as for happy-backend-lalr.

In light of that, I emphatically agree with you: Let's just have a single, public library component happy!

phadej commented 2 months ago

FYI, the cabal-install solver doesn't do per-component solving. If a package has multiple components, cabal-install will solve for all of these components, even if it needs (and will only build) a single one. E.g. even if you have an old-style package with a library and an executable (i.e. two components), and you only depend on the library, the executable's dependencies still need to be satisfied. They will not be built, but they may affect the install plan (and in the worst case make the dependency problem unsatisfiable).

That's not an issue for happy, as there aren't many (any?) unique dependencies in any components, but it is still a limitation good to be aware of.
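A sketch of the situation described above, using an entirely hypothetical package (`example-pkg`, `example-tool`, and `some-heavy-dep` are made-up names):

```cabal
cabal-version: 3.0
name:          example-pkg
version:       0.1.0.0

library
  exposed-modules:  Example
  build-depends:    base
  default-language: Haskell2010

-- A client that depends only on the library never builds this executable.
-- Because solving is per package, the solver must still find a version of
-- some-heavy-dep that fits these bounds, or the install plan fails.
executable example-tool
  main-is:          Main.hs
  build-depends:    base, example-pkg, some-heavy-dep >=2 && <3
  default-language: Haskell2010
```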

sgraf812 commented 2 months ago

Thanks for pointing that out. Indeed that means there is no real advantage to using multiple library components vs. putting everything into a single library. I think it would be worthwhile to pursue the latter solution then.

Ericson2314 commented 2 months ago

I think we should do this. The purpose of this feature is exactly our case: enforcing modularity between software components that are nonetheless versioned together.

If this feature is not working for us, that is a bug that should be reported upstream.

Conversely, if folks are afraid of unforeseen consequences and afraid to try it out, that is a serious ecosystem problem (we lack confidence in other groups' investments, and thus everything is going to waste) that should be escalated to the HF.

Bodigrim commented 2 months ago

Out of these packages, I found only saturn on Stackage:

* https://www.stackage.org/nightly-2024-09-13/package/saturn-1.0.0.5

Its public libraries are saturn:unstable and saturn:spec. I'm not entirely sure how exactly one would work with those.

Both Hackage and Stackage are mum about these components and their contents:

I think this is clear evidence that the ecosystem is nowhere near ready for public library components.

Bodigrim commented 2 months ago

If I were you, I'd merge all components into the same Cabal file as internal (non-public) sublibraries. Then add the main library which is just an empty shell re-exporting all modules of internal libraries (reexported-modules). The benefits are:

Ericson2314 commented 2 months ago

Thanks @Bodigrim! That's a very nice compromise that keeps things moving / avoids a Postel's law deadlock, and also avoids foot-guns of incomplete features. Really about as good as we can expect given the current state of Hackage support.

phadej commented 2 months ago

Hackage (or rather haddock?) doesn't have support for reexported-modules either. See https://hackage.haskell.org/package/Cabal re-exporting modules from Cabal-syntax. Luckily, Cabal-syntax is a public library, so you can look up stuff there.

Ericson2314 commented 2 months ago

@phadej On the flip side, if Cabal can use reexported-modules: in production and the universe didn't come falling down, then even if there are unfortunate UI errors, I'd consider it good enough.

Bodigrim commented 2 months ago

Hackage (or rather haddock?) doesn't have support for reexported-modules either. See https://hackage.haskell.org/package/Cabal re-exporting modules from Cabal-syntax.

Not quite: Haddock ignores reexported-modules when they belong to another package. But it works fine if you reexport modules from your own internal sublibrary. For instance, see https://hackage.haskell.org/package/tar

Ericson2314 commented 2 months ago

@Bodigrim I suppose we can do the same reexporting trick from a public sub-library, and it should still work just as well as a compat shim for anything that doesn't support public sub-libraries but does support private sub-libraries?

Bodigrim commented 2 months ago

I guess so, yes. But I'd recommend starting conservatively. Bear in mind that shuffling public sublibraries around should be PVP compliant: anytime you change their API or composition, a major version of entire happy should be bumped. Is the structure of sublibraries mature enough? It would be a pity to bump the major version of happy every now and then without a strong reason.

(I'm personally skeptical about public sublibraries in general, they strike me as a wrong design, so take my opinion with a grain of salt)

Ericson2314 commented 2 months ago

Yeah, that's a fair point. Really I'd like the executable to be versioned separately from all libraries, split or combined, for that same reason.

phadej commented 2 months ago

@Bodigrim

Hackage (or rather haddock?) doesn't have support for reexported-modules either. See https://hackage.haskell.org/package/Cabal re-exporting modules from Cabal-syntax.

Not quite: Haddock ignores reexported-modules when they belong to another package. But it works fine if you reexport modules from your own internal sublibrary. For instance, see https://hackage.haskell.org/package/tar

Interesting. That seems to be a happy coincidence, but it still does not work properly. I tried having multiple sub-libraries and re-exporting modules from those (then only modules from one sublibrary were in the haddock tarball), or having any module in the main library in addition (then only modules from the main library were in the haddock tarball).

So, nope, it doesn't work fine even when you re-export modules from your own internal sublibrary.

EDIT: there is also https://hackage.haskell.org/package/tar-0.6.3.0/docs/Codec-Archive-Tar-Index-Utils.html which is also shown in search and hoogle output. Not bad, but that's definitely a bug.

sgraf812 commented 2 months ago

The suggestion of @Bodigrim is indeed incredibly helpful and worthwhile in finding a compromise that all maintainers find agreeable. That is, have a public umbrella library happy:lib (or happy-lib, see below) re-export modules from private sublibraries. The Hackage ergonomics of doing so seem acceptable, as exemplified by the tar library.
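A rough sketch of that umbrella layout, assuming the package is named happy-lib and using illustrative sublibrary names, directories, and module names (the real cabal file may well differ):

```cabal
cabal-version: 3.0
name:          happy-lib
version:       2.0

-- Internal sublibraries: no `visibility` field, so they default to private.
library grammar
  hs-source-dirs:   grammar/src
  exposed-modules:  Happy.Grammar
  build-depends:    base
  default-language: Haskell2010

library tabular
  hs-source-dirs:   tabular/src
  exposed-modules:  Happy.Tabular
  build-depends:    base, happy-lib:grammar
  default-language: Haskell2010

-- Main (unnamed) library: an empty shell with no modules of its own.
-- It only re-exports what the private sublibraries expose.
library
  build-depends:      happy-lib:grammar, happy-lib:tabular
  reexported-modules: Happy.Grammar, Happy.Tabular
  default-language:   Haskell2010
```

Clients would depend on plain `happy-lib` and import `Happy.Grammar` or `Happy.Tabular` without knowing which sublibrary defines them, so the internal split can be reshuffled without changing the public package name.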

There is a separate issue of whether we want to couple the versioning of happy:exe and happy:lib. Increasingly I'm tempted to think that we do not want that. For one, it's tedious to remember in discussions to distinguish between happy:exe and happy:lib, as you can tell from the Cabal-the-library vs. cabal-install situation.

More seriously, while a version bump to 2.0 for happy:exe is warranted because we drop support for non-GHC compilers, I do not think that in the future we will see many version bumps to happy:exe that are as drastic. On the other hand, I envision happy:lib to change swiftly and dramatically, (have rather bad haddocks for lack of resources), and offer much shorter support cycles (that is, we'll maintain only the most recent version). For example, the addition of the catch mechanism for resumptive parsing described in https://github.com/haskell/happy/pull/272 will need a major version bump of happy:lib, but only a minor bump (perhaps even just patch) for happy:exe. (Do note that a revamped version of that patch will become a reality; our GSoC student @Kariiem is doing an excellent job at applying it to GHC.)

In light of that, I would like to propose to have a separate package happy-lib, consisting of just the public library shim that re-exports its private sublibraries, and version/maintain it separately from happy:exe. This incurs some maintenance overhead, but I hope it is justified. Note that there is a qualitative difference in support when depending on happy vs. happy-lib: The former is supposed to be very stable, while the latter is expected to be lacking in documentation and support cycles.
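From a client's point of view, the two kinds of dependency might look roughly as follows. This is only a sketch: the client package, its module, and the version bounds are assumptions, not recommendations from the maintainers.

```cabal
cabal-version: 3.0
name:          my-parser
version:       0.1.0.0

library
  exposed-modules:    MyParser
  -- The stable executable, used as a build tool for .y files.
  build-tool-depends: happy:happy >=2.0
  -- The fast-moving library API, so a tight bound seems prudent.
  build-depends:      base, happy-lib ==2.0.*
  default-language:   Haskell2010
```

Most packages that merely contain `.y` grammars would need only the `build-tool-depends` line; the `happy-lib` dependency is for code that calls into the generator's internals.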

Edit: I opened PR #297 for the happy-lib+happy design.

sgraf812 commented 2 months ago

Following the merge of #299, I tried multiple times to upload haddocks for the happy-lib+internal sublibraries solution. In doing so, I was only ever able to get haddocks for one of the sublibraries (often backend-glr or backend-lalr, because they had been processed last).

I've meanwhile published a package candidate where I simply expose modules rather than reexport them. Alas, it appears I need to completely remove the sublibraries from the cabal file in order to see documentation for all modules.

So it appears there is a real cost associated with using sublibraries, even if only internal: We compromise on uploaded haddocks. Perhaps hackage's haddock crawler can do better, but that seems at least annoying.

I'm not even sure whether this is an issue of haddock or hackage.

sgraf812 commented 2 months ago

I'm pretty sure that haddock is the culprit. It produces one haddock path for each of the sublibraries, but none for the main library which reexports a subset. I attached one of the doc.tar.gz tarballs that cabal haddock --haddock-for-hackage --enable-doc --haddock-options=--quickjump produces as a result. Note that it arbitrarily picks the backend-lalr.

happy-lib-2.0-docs.tar.gz

This also explains why it hasn't been an issue for tar: It only has a single sublibrary where it properly displays just the reexports.

Ericson2314 commented 2 months ago

Let's create a haddock issue for this stuff (I guess that's now a GHC issue?)

sgraf812 commented 2 months ago

Opened here: https://gitlab.haskell.org/ghc/ghc/-/issues/25270

Ericson2314 commented 2 months ago

And this Cabal issue: https://github.com/haskell/cabal/issues/10368