Cabal should perhaps reexport all the modules from Cabal-syntax to reduce breakage

mpickering commented 2 years ago

The changes to split up Cabal and Cabal-syntax cause most packages which use the Cabal library to fail to build (https://gitlab.haskell.org/ghc/head.hackage/-/jobs/944876).

Perhaps it would be worthwhile to make Cabal reexport all the modules in Cabal-syntax to minimise this kind of distruption.

cc @Mikolaj

Mikolaj commented 2 years ago

@patrickdoc, @gbaz, @fgaz, what do you think?

gbaz commented 2 years ago

Hrm. I recall there's one general issue that discourages module reexports:

If package A re-exports package B, then the api surface of package B becomes part of the surface of package A.

So to stick to the PVP, each version of package A should in pin the dependency on package B to within a single minor revision range. This can be a bit of a pain. In our case, it may be ok, given that we control everything.

I don't recall any other things we may need to worry about offhand, but also I am not sure...

fgaz commented 2 years ago

I'm :-1: on this. One of the reasons to separate Cabal-syntax was to decouple the two release processes. We could only reexport those module in the next version as a transitory period, but that would only delay the breakage, wouldn't it?

mpickering commented 2 years ago

I imagine most people who are depending on Cabal do not care at all about the difference between Cabal and Cabal-syntax. If they want to just depend on Cabal-syntax and get more updates then they can modify their programs to do so but otherwise they will be fine just using Cabal with whatever version it depends on.

fgaz commented 2 years ago

There's more discussion about this starting from https://github.com/haskell/cabal/pull/7620#issuecomment-914180574

Mikolaj commented 2 years ago

For reference, here's a significant part of the original motivation: #7559, In particular, it seems re-exporting all of Cabal in Cabal-syntax would be a deal breaker, but I'm not sure about the reverse.

However, another part of the motivation, not mentioned there [edit: @fgaz just linked to it], is permitting people to bundle their own versions of Cabal-syntax that parse their .xml or .tex .cabal file mutations. Could we perhaps work towards a small and general enough API that it fits any wild experiments with .cabal files formats and so can be included in Cabal without stifling progress? Or is exposing the implementation of datatypes required?

@mpickering: in your particular use case(s), which parts of Cabal-syntax would you find convenient to be exported in Cabal?

mpickering commented 2 years ago

My particular concern is unnecessary downstream breakage but if the cost of this change has been evaluated and the breakage is deemed acceptable then I don't have a further opinion.

Mikolaj commented 2 years ago

Any other comments? @fendor? @andreasabel?

I hope users of Cabal the library are going to try the master branch and will let us know if the migration can/should be made easier.

fgaz commented 2 years ago

Maybe we should send a message in the mailing list alerting of this change? We should also do proper release candidates this time.

fendor commented 2 years ago

Looking at hackage reverse deps, we can see that almost 700 packages depend on Cabal. Admittedly, almost 200 have an outdated dependency, but that's still potentially 500 packages that break now.

For comparison, ghc has 250 reverse deps on Hackage. The comparison is not fair at all, since every package can bump their dependencies individually, and does not have to wait for its dependencies to update as well, etc... But my point being, that a lot of packages break.

I think we have the following goals:

Maintainers have to update their dependency only once
Decouple Cabal and Cabal-syntax for good
Make the transition as easy as possible

Since the changes are mostly mechanical, maybe we can introduce a migration tool for cabal files (called from now on cabal-migrate)? In general, this might be helpful for other features as well and does not need to live in cabal itself.

I think we can tick most of the boxes from above in the following way:

For now, re-export Cabal-syntax in Cabal (this couples it together again)
Implement a cli tool that makes the transition trivial (e.g. cabal-migrate --issue cabal-syntax)
- This requires Cabal-syntax exactprinter.
Stop re-exporting Cabal-syntax from Cabal
Point people to cabal-migrate (If we merge it into cabal directly, the error message can point to the command immediately once it notices what error has occurred).

Unfortunately, all of this is a fair amount of work (exactprinter, cabal-migrate) and might take longer than the release window we currently have. Maybe, it is best to just watch Hackage burn for a bit.

gbaz commented 2 years ago

I strongly suspect that most packages that build with the cabal library do so because of custom setup scripts, and furthermore those scripts only make use of one or two things exported by cabal-syntax at the very most (and perhaps most use nothing).

That said, @mpickering, I can't determine from the output of what you linked which packages fail and for what reason. I suspect that Distribution.Package may well be the primary reason they fail. If so, I would be open to a limited re-export of one or two key modules for now, if it has significant impact.

fgaz commented 2 years ago

Also the fix is trivial: just add Cabal-syntax to the dependencies. A dummy Cabal-syntax was released so there's no need for conditionals (please correct me if I'm wrong @patrickdoc)

patrickdoc commented 2 years ago

For end users, yeah, you should be able to just add Cabal-syntax.

For packages that want to support many versions of Cabal, you can use the conditionals recommended here: https://hackage.haskell.org/package/Cabal-syntax

mpickering commented 2 years ago

My experience is that you can't just add a Cabal-syntax dependency as the solver is free to pick different Cabal and Cabal-syntax versions. Therefore you end up with plans which include Cabal-3.2.0.0 and Cabal-syntax-3.7 which export modules of the same name. Perhaps I did something wrong on my end.

My latest attempt at patches is here and I need to debug why these are failing.

fgaz commented 2 years ago

Huh, I thought we made it so those kind of conflicts are excluded

patrickdoc commented 2 years ago

Cross-package dependencies are a little hard to express, and I don't think it's possible as a consumer of a library to express: "I'm not using Cabal, but because I'm using Cabal-syntax you must use Cabal > 3.6"

If all of your dependencies that require Cabal have applied the conditional linked above, then you can safely request only Cabal-syntax.

If any of your dependencies that require Cabal have not applied the conditionals, then your options are:

Force them to use a newer Cabal version by depending on it yourself (i.e. adding Cabal > 3.6 in your .cabal)
Add the conditionals (which will give you a working build by pessimistically defaulting to the older Cabal version)

In short, based on my understanding of how resolution works, you can't use Cabal-syntax until all of your dependencies that care have added support for it.

Edit: That came off a little more "final" than intended. The conclusion is that you can't end up with a solved dependency tree that includes Cabal-syntax until all of your dependencies that use Cabal have handled the split. Which is sort of understandable as that would be the result of any breaking change I think.

jneira commented 2 years ago

"I'm not using Cabal, but because I'm using Cabal-syntax you must use Cabal > 3.6"

hmmm and what about adding Cabal-syntax < 0 (or something equivalent) to all conflicting Cabal versions, via revisions if needed? EDIT: or even better Cabal-syntax < 3.7

jneira commented 2 years ago

"I'm not using Cabal, but because I'm using Cabal-syntax you must use Cabal > 3.6"

hmmm and what about adding Cabal-syntax < 0 (or something equivalent) to all conflicting Cabal versions, via revisions if needed? EDIT: or even better Cabal-syntax < 3.7

oh, i forgot you cant add new dependencies via revisions :-/ So it would need upload new Cabal minor versions for all major versions we want to be compatible. Maybe prevent breakage worths it. We could do it for all Cabal versions >= 3.0 and do it on demand for older versions

EDIT: It would not help packages with a exact bound (f.e. Cabal == 2.4), i hope they will not be many EDIT: i am not very proficient with regexps but this query only returns 3 packages: https://hackage-search.serokell.io/?q=Cabal.*%3D%3D%5B%5E%5C*%5D*%5B%2C%5Cn%5D

andreasabel commented 2 years ago

@jneira

EDIT: i am not very proficient with regexps but this query only returns 3 packages:

The regex Cabal[[:space:]]*==[ 0-9.*]+ finds some more. Of the 28 hits it returns, most should be valid.

jneira commented 2 years ago

Cabal[[:space:]]==[ 0-9.]+

@jneira

EDIT: i am not very proficient with regexps but this query only returns 3 packages:

The regex Cabal[[:space:]]*==[ 0-9.*]+ finds some more. Of the 28 hits it returns, most should be valid.

hmm i see a lot of them similar to Cabal == 3.6.* and that ones will get x.y.z.1, is it?

andreasabel commented 2 years ago

Looking at @mpickering 's patches, I am worried. Atm, there will be only one Cabal version with a -syntax sibling, but what if there are many. Do you need one impl(ghc) conditional for each ghc version to couple the correct -syntax package to Cabal?

To solve the problem of coupling Cabal and Cabal-syntax, couldn't we have:

Cabal-syntax.
Cabal-engine: This is current-Cabal minus Cabal-syntax.
Cabal as a shim that is simply Cabal-syntax + Cabal-engine.

If you depend on Cabal, you simply get what you used to get, but you could also depend on Cabal-syntax or depend on Cabal-engine with a different implementation of the Cabal syntax.

Would this work? (Maybe someone more immersed in the matter can answer this.)

jneira commented 2 years ago

Cabal as a shim that is simply Cabal-syntax + Cabal-engine.

And keep the shim idefinitely (or until all sensible packages switch to Cabal-syntax + Cabal-engine)? Maintainers would not have any reason to do it, as you are offering the shim with lastest fixes and features.

jneira commented 2 years ago

hmm i see a lot of them similar to Cabal == 3.6.* and that ones will get x.y.z.1, is it?

Moreover hackage revisions could be done for those cases Cabal == x.y.z

mpickering commented 2 years ago

It turns out your can't write conditionals in a setup-depends field so for head.hackage I have

Remove the allow-newer: Cabal from the cabal.project
Added upper bounds (Cabal < 3.7) for all relevant packages with Setup.hs scripts

Mikolaj commented 2 years ago

2. Added upper bounds (Cabal < 3.7) for all relevant packages with Setup.hs scripts

Sadly, that won't work in the long run --- all custom script packages will be stuck at Cabal 3.6 until that starts conflicting with other things and then they should all at once switch to >= 3.7 (can be done in bulk in Hackage, but it won't work if some of them have genuine problems with newer Cabals).

andreasabel commented 2 years ago

Cabal as a shim that is simply Cabal-syntax + Cabal-engine.

And keep the shim idefinitely (or until all sensible packages switch to Cabal-syntax + Cabal-engine)? Maintainers would not have any reason to do it, as you are offering the shim with lastest fixes and features.

Yes, this shim could be auto-generated by just reexporting the modules of these two packages (or is there a lighter way?). Maintainers can chose to switch whenever it gives them benefits (like shorter compilation times if they only need Cabal-syntax).

If we want to nudge folks to get away from the shim Cabal, we could have deprecation warnings. But why, really?

mpickering commented 2 years ago

2. Added upper bounds (Cabal < 3.7) for all relevant packages with Setup.hs scripts
Sadly, that won't work in the long run --- all custom script packages will be stuck at Cabal 3.6 until that starts conflicting with other things and then they should all at once switch to >= 3.7 (can be done in bulk in Hackage, but it won't work if some of them have genuine problems with newer Cabals).

Right, this just unblocks me for head.hackage, not a long-term solution.

Mikolaj commented 2 years ago

Yes, this shim could be auto-generated by just reexporting the modules of these two packages (or is there a lighter way?). Maintainers can chose to switch whenever it gives them benefits (like shorter compilation times if they only need Cabal-syntax).

I'm +1 on the shim. What's the cheapest scheme?

If we want to nudge folks to get away from the shim Cabal, we could have deprecation warnings. But why, really?

Custom setups are destined to die anyway, so there's not much point helping users improve them. But for tool builders depending on only Cabal-syntax or only Cabal-engine may have many advantages. OTOH, they are quite likely to become aware of the split just from changelogs or looking at the source, without any aggressive advertising. So I wouldn't worry.

Mikolaj commented 2 years ago

A wizard of yore tells me

There is literally zero use cases for Cabal-engine alone. Except maybe something like main = defaultMain Most of Cabal-engine types are defined in Cabal-syntax

which makes me +0.5 on the shim and +0.5 on full re-export. If we want limited and mostly opt-in breakage in the short and in the long run, do we have any other options?

gbaz commented 2 years ago

Can somebody link to or inventory which packages on head.hackage require changes, and why? (i.e. what module in particular do they depend on that requires a custom invocation of cabal-syntax).

I gather basically all these packages are packages with custom setup scripts of some kind -- is this correct? These details will help inform a good approach.

fgaz commented 2 years ago

Looking at the above, I agree with reexporting some or all modules for a limited amount of time, that is until packages can just have setup-depends: Cabal >=3.7 < something, Cabal-syntax and still have a wide enough range of cabals.

@andreasabel

Looking at @mpickering 's patches, I am worried. Atm, there will be only one Cabal version with a -syntax sibling, but what if there are many. Do you need one impl(ghc) conditional for each ghc version to couple the correct -syntax package to Cabal?

as I understand it, only the split is a problem

mpickering commented 2 years ago

@gbaz The head.hackage patch is here, it's only a few packages but it is pretty awkward.

https://gitlab.haskell.org/ghc/head.hackage/-/merge_requests/205

gbaz commented 2 years ago

Ok, so it seems the impact really is fundamentally on custom setup packages, and particularly only on a few modules commonly used by them. That said, the modules commonly used end up reexporting a ton more functions from other modules themselves.

see e.g. https://hackage.haskell.org/package/Cabal-3.6.2.0/docs/Distribution-PackageDescription.html

That said, I think that re-exports of the following set of modules (give or take) should probably suffice:

Distribution.Package, Distribution.PackageDescription, Distribution.System, Distribution.Version, Distribution.Compiler

I don't know if there's a way to test if exports of this set unblock the bulk of head.hackage, but it would be good if this could be checked.

phadej commented 2 years ago

@gbaz the problem is that custom-setup doesn't allow any kind of conditionals, so we cannot write

custom-setup
  setup-depends: base, Cabal

  if flag(Cabal-syntax)
    setup-depends: Cabal-syntax >=3.8, Cabal >=3.8
  else
    setup-depends: Cabal <3.8

Note:

custom-setup
  setup-depends: base, Cabal, Cabal-syntax

would allow solver to pick up e.g. Cabal-3.6 and Cabal-syntax-3.8. And in fact it will, when you use GHC-9.2, as Cabal-3.6 is bundled one, thus preferred.

And that will lead into ambiguous module imports.

Thus a less "wrong" way is to specify:

custom-setup
  setup-depends: base, Cabal >=3.8, Cabal-syntax >=3.8

forcing older compiler users to also newer Cabal.

It's slightly inconvenient, although maybe it's ok price to pay for using build-type: Custom to begin with.

If asking build-type: Custom people to force use Cabal >=3.8, no re-exporting would be required.
If you don't want to do it, then re-exporting everything is safer.
Adding conditionals support to custom-setup would be great for other reasons too.

Not allowing flags (i.e. "dynamic" conditionals) will be enough as well, if it's simpler (= quicker) to implement.

custom-setup
  setup-depends: base, Cabal

  if impl(ghc >=9.4)
    setup-depends: Cabal-syntax >=3.8, Cabal >=3.8
  else
    setup-depends: Cabal <3.8

is good enough approximation.

gbaz commented 2 years ago

I don't think there's any need for conditionals if people just want to depend on Cabal. If they depend on both Cabal and Cabal-syntax, then the bound on Cabal-syntax should ensure that only the empty Cabal-syntax file is compatible with pre-split cabals.

The only need for conditionals is when people want to depend on just cabal-syntax, but also be able to build with pre-split versions of cabal instead.

The issue that I think we want to solve here is entirely unrelated to conditionals. It is simply that without module re-exports, many packages that used to build with Cabal must also build with Cabal-syntax if the are building with a post-split cabal.

So going through and forcing all packages which used to depend only on Cabal to depend on both Cabal and cabal-syntax should work, but that affects lots of packages with custom setup-depends stanzas.

This leads me to a rather evil idea.

What if we teach cabal itself that any package which has a setup-depends requiring Cabal also automatically requires Cabal-syntax, and alter the default settings the same way.

I think this should Just Work (tm).

phadej commented 2 years ago

If they depend on both Cabal and Cabal-syntax, then the bound on Cabal-syntax should ensure that only the empty Cabal-syntax file is compatible with pre-split cabals.

Which bound? The future Cabal-syntax-3.8 and e.g. Cabal-3.6 could exist in the same install plan. Cabal-syntax-3.8 doesn't have a dependency on Cabal. The build-depends: Cabal <3.7 in Cabal-syntax-3.6 is not sufficient.

Mikolaj commented 2 years ago

I think this should Just Work (tm).

Not with old cabal-installs, though, as much as I appreciate the evil genius of the idea.

gbaz commented 2 years ago

Which bound? The future Cabal-syntax-3.8 and e.g. Cabal-3.6 could exist in the same install plan. Cabal-syntax-3.8 doesn't have a dependency on Cabal. The build-depends: Cabal <3.7 in Cabal-syntax-3.6 is not sufficient.

Hrm, good point. Drat. Would it be reasonable to revise older cabals to force them to rely on the dummy cabal-syntax? (I know such revisions would require an exemption on hackage to allow, but we could do so).

If this seems like too much complexity, and if we're really worried about forcing too many custom setup users to upgrade, I suppose reexporting some modules (again, starting with the ones I proposed above) is the only way to go.

fgaz commented 2 years ago

Would it be reasonable to revise older cabals to force them to rely on the dummy cabal-syntax

No, unfortunately revisions can't add dependencies

gbaz commented 2 years ago

(unless we patch hackage to allow an exception, which we can [evil grin])

jneira commented 2 years ago

(unless we patch hackage to allow an exception, which we can [evil grin])

the-gates-of-hell-open

bgamari commented 2 years ago

I completely understand the reluctance to reexport Cabal-syntax from Cabal. On the other hand, I'm quite worried about the impact of this change on users. One approach to mitigate this would be to introduce Cabal as a metapackage which reexports Cabal-syntax along with the current contents of Cabal, which would then live in a new package (e.g. Cabal-internal). This would avoid immediate breakage while preserving the ability of users to transition to Cabal-syntax if it is sufficient.

Does this sound like a plausible approach?

gbaz commented 2 years ago

I think at this point the actual cost of reexporting is not so great, and we probably should just do it. If someone could try to enumerate the disadvantages of doing so clearly, it would be good to see that all in one place so we can evaluate it.

Mikolaj commented 2 years ago

@bgamari: what you propose looks similar to the shim idea discussed starting at https://github.com/haskell/cabal/issues/7974#issuecomment-1041257275

For the shim idea we agreed that reexporting seems preferable in this particular case, but if you have new arguments, let's restart the discussion.

bgamari commented 2 years ago

@bgamari: what you propose looks similar to the shim idea discussed starting at #7974 (comment)

Ahh, apologies! Somehow I missed that comment. Yes, this is precisely what I was thinking. Sorry for the noise.

Mikolaj commented 2 years ago

It seems nobody objects to the idea of (temporarily) re-exporting Cabal-syntax from Cabal. We talked at today's dev call and probably the easiest way is to re-export everything. We might also immediately warn in release notes that this shim is going to be removed at some point. We could also, at once or not, deprecate all the re-exported things with proper pragmas, somehow explaining that they are still available, but in Cabal-syntax.

Any last minute objections? Any refinements of the idea? Would anybody like to implement that after we wait for a few days for any objections?

gbaz commented 2 years ago

In writing up some notes on this, I just also came up with one more "evil trick" that we can consider -- we "simulate" flags by introducing a new "virtual flag" via a phantom dependency.

Create a new package, "cabal-syntax-split" that is empty and has two versions, call them 1 and 2. Force cabal-syntax to depend on version 2. Revise all pre-split versions of Cabal to depend on version 1. This resolves the issue raised by @phadej that

Which bound? The future Cabal-syntax-3.8 and e.g. Cabal-3.6 could exist in the same install plan. Cabal-syntax-3.8 doesn't have a dependency on Cabal. The build-depends: Cabal <3.7 in Cabal-syntax-3.6 is not sufficient.

which also I think was what motivated mpickering to raise this issue when he wrote that

My experience is that you can't just add a Cabal-syntax dependency as the solver is free to pick different Cabal and Cabal-syntax versions. Therefore you end up with plans which include Cabal-3.2.0.0 and Cabal-syntax-3.7 which export modules of the same name.

Sorry to raise this only just after our meeting, but it just only occurred to me.

phadej commented 2 years ago

Revise all pre-split versions of Cabal to depend on version 1.

This is strictly worse (= new package) then just revising Cabal to depend on empty Cabal-syntax.

gbaz commented 2 years ago

Fair point. Why don't we just revise all prior cabals to depend on empty cabal-syntax then?

Mikolaj commented 2 years ago

What if a sneaky cabal hides somewhere and we don't apprehend it and don't revise it?

E.g., perhaps some Hackage mirrors that people use have some extra versions of cabal? I can't easily imagine other failure scenarios but are we sure there are none?

haskell / cabal

Cabal should perhaps reexport all the modules from Cabal-syntax to reduce breakage #7974