haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal
Other
1.62k stars 697 forks source link

[RFC] `package-constraints` #9477

Closed jasagredo closed 10 months ago

jasagredo commented 11 months ago

Problem this solves

When describing a package in a .cabal file, multiple components can declare the same dependency in their build-depends sections. There can be two flavors of this:

  1. Components actually depend on different versions of the dependency, for example if we are using one version of aeson in one test-suite and a different version in another, in order to test backwards compatibility always.
  2. Components depend on the same version of the dependency.

It is to case (2) that we turn our attention in this RFC. On the one hand it feels redundant and error-prone to repeat the version bound in all components. On the other hand, in order to make sure you are building with the same version regardless of what subset of those components is enabled one has to perform a topological-sort of the components and assign the version bound in the bottom-most component.

A different solution is to use common stanzas as done by some packages but this leads to a weird syntax, where the import section of a component is importing both configurations and build-depends sections.

Proposed solution

Implement a new section in the .cabal package file named package-constraints which contains a list of dependencies in the same format as a build-depends.

For each of the dependencies in the package-constraints, whenever a dependency on the same package appears in a build-depends section of a component, the version declared in the component will be intersected with the version from package-constraints. In particular, if the build-depends of the component declares no version bound, it will inherit the one from package-constraints.

For specific cases like (1) above, the user should not include such a special dependency in these package-constraints section.

This will be just a syntactic sugar and will have no influence in:

Backwards compatibility

This change would be fully backwards-compatible as there existed no field with this name before and omitting it would work the same way as before this change, i.e. it is an optional field.

michaelpj commented 11 months ago

Sounds good to me.

A risk is that package-constraints sounds a lot like constraints, and users might expect it to behave the same as cabal.project constraints, whereas it in fact doesn't (due to it not applying if the dependency isn't used).

I think that's probably not too confusing, but it is a potential footgun.

gbaz commented 11 months ago

component-depends-constraints ?

Mikolaj commented 11 months ago

component-deps-constraints? (or make it an alias of the longer form)

But I'm fine with the original name as well.

ulysses4ever commented 11 months ago

The original is fine with me. The alternative looks ugly (I'm sorry!).

andreabedini commented 11 months ago

Thank you for opening a RFC.

When describing a package in a .cabal file, multiple components can declare the same dependency in their build-depends sections. There can be two flavors of this: Components actually depend on different versions of the dependency, for example if we are using one version of aeson in one test-suite and a different version in another, in order to test backwards compatibility always. Components depend on the same version of the dependency.

We need to be more precise. Each component can list their own dependencies, which are made of a package name, an optional library name and a version range.

AFAIK the dependencies of all components are merged toghether when solving for a plan. This includes the dependencies of executable components (at least in the v2 code paths). So if two components specify different ranges for aeson, the solver will only pick a version of aeson that satisfies both version ranges.

Tests and benchmarks components are a bit special, because they can be enabled or disabled, either automatically or manually, and when they are disabled their dependencies are not taken into account.

Therefore the following does not seem to make much sense (in general).

On the other hand, in order to make sure you are building with the same version regardless of what subset of those components is enabled one has to perform a topological-sort of the components and assign the version bound in the bottom-most component.

I agree that listing build-depends for each component, only to have them always merged togheter, is repetitive and confusing. I would not be opposed to a better way to express this.

AFAIU these constraints would not imply a dependency but only a version range to intersect with? I am afraid the difference between a build-deps and a package-constraint might be as confusing as separate build-deps per components. I wonder if it could make sense a package level build-deps, it could be simpler for those who don't want to be bothered making a per-component distinction.

I think this needs to be understood and discussed more before we can make a decision.

michaelpj commented 11 months ago

I just want to repeat a point discussed in the other issue: one of the reasons for doing it the way that is proposed rather than making this a constraint that goes into the solver is that we do feel like we want to state a local constraint. That is, we only want the constraints to apply to the dependencies generated by the local components, and not those elsewhere in the build plan.

The case we want to avoid is this:

package-constraints: foo > 2

library
   build-depends: foo

executable wot
   build-depends bar

where bar depends on foo. Now if we try to build wot it seems that we should not incur the foo > 2 constraint when choosing which foo to use to build bar.

(Having written this, I once again am uncertain whether this is true. I think the solver knows about qualified goals and implication goals, so maybe it is possible to add a goal that says "if you need to pick x when trying to solve for y, then here is a constraint on x".)

I wonder if it could make sense a package level build-deps, it could be simpler for those who don't want to be bothered making a per-component distinction.

There is a key difference between this and build-depends: build-depends means:

  1. Pass flags to GHC telling it to use this package during compilation, and
  2. Link this package into the final build product

Both of these are quite undesirable if you have some components that don't use a dependency. So we really do want to say that this is only an instruction to the solver if the dependency is used, and not an instruction to use it.

mpickering commented 11 months ago

Components actually depend on different versions of the dependency, for example if we are using one version of aeson in one test-suite and a different version in another, in order to test backwards compatibility always.

As @andreabedini points out, this isn't possible, only one version of aeson will be selected.


I think it could be a bit confusing how this is quite a syntactic feature and the constraints specified here will operate in quite a different way to if you specify --constraint=.. via the command line. For example, if you specify a constraint on a package which isn't a direct build dependency, then I assume it's just ignored under this proposal?

Also, it should be considered how these constraints should interact with qualified contexts:

andreabedini commented 11 months ago

@michaelpj

where bar depends on foo. Now if we try to build wot it seems that we should not incur the foo > 2 constraint when choosing which foo to use to build bar.

I am not sure I follow what you mean with "the case we want to avoid is" but executable dependencies and library dependencies are merged together (top-level namespace/goal? I am not sure what the correct term is) so foo > 2 will always apply.

one of the reasons for doing it the way that is proposed rather than making this a constraint that goes into the solver is that we do feel like we want to state a local constraint.

The thing is that we already have those, and they are the version ranges in build-depends :-)

Both of these are quite undesirable if you have some components that don't use a dependency. So we really do want to say that this is only an instruction to the solver if the dependency is used, and not an instruction to use it.

In that phrase I actually meant "add this dependency to all components in the package". I am aware that is not what it is proposed here.

Let's start from the beginning. What problem does this want to solve? Less repetition? More clarity w.r.t the required version ranges across components?

michaelpj commented 11 months ago

I am not sure I follow what you mean with "the case we want to avoid is" but executable dependencies and library dependencies are merged together (top-level namespace/goal? I am not sure what the correct term is) so foo > 2 will always apply.

Ah sorry, perhaps I am assuming that https://github.com/haskell/cabal/issues/4087 is done. But hopefully it will be done at some point! So we don't want this to work badly with per-component solving.

The thing is that we already have those, and they are the version ranges in build-depends :-)

Sure, but the problem is that build-depends does more than that. As I commented in the other issue, if we had a way to say, per-component, "please use this constraint only if you need to pick this dependency", then we could just use common stanzas to apply that to each component as needed.

Let's start from the beginning. What problem does this want to solve? Less repetition? More clarity w.r.t the required version ranges across components?

The aim is to improve the common case of "I have many components, and I just want them all to use the same bounds for foo (if they use it)". I think this what I want for approximately every package I have ever maintained, with a very few exceptions where I wanted different bounds for one component. At the moment there are a few ways to achieve this:

  1. Repeat the constraint for every entry in build-depends for every component (tedious and error-prone)
  2. Super ugly common stanzas (like cabal-cache)
  3. Just don't put constraints on components other than the library component (non-tedious but error-prone (what if you have a component that doesn't depend on the library? or uses a dependency that isn't constrained by the library?) and cabal check might reasonably be unhappy)

What we want is a way to state directly the intention and have cabal do the right thing. The questions we are trying to answer are:

  1. What precisely is the intention? Specifically
    • Should the constraint for foo apply if solving for a component that doesn't depend on foo directly?
  2. How do we convey the intention to cabal? Specifically
    • Top-level stanza or per-component stanza plus common stanzas?
    • Syntactic rewrite or somehow something the solver is aware of?

I think it could be a bit confusing how this is quite a syntactic feature and the constraints specified here will operate in quite a different way to if you specify --constraint=.. via the command line. For example, if you specify a constraint on a package which isn't a direct build dependency, then I assume it's just ignored under this proposal?

I do agree with this, and it makes me somewhat reluctant to use the "constraint" naming. The other way of looking at it might be: how bad would it be if it worked just like constraints, i.e. the constraints were applied globally even if the constrained package isn't a build dependency. That might just be fine in practice (and in fact I've had times when I wanted to do this).

If we did this I'd be tempted to go for a per-component constraints stanza, and just use common stanzas to apply it to each component. Or does this only make sense once we have per-component solving?

My main qualm would be that I think until now cabal files have been exclusively "local", i.e. they talk only about the direct dependencies of the package. This feature would allow cabal files to instead say things about arbitrary other packages somewhere in the build graph. I don't know if that's bad, but I think it's new.

andreabedini commented 11 months ago

Ah sorry, perhaps I am assuming that #4087 is done. But hopefully it will be done at some point! So we don't want this to work badly with per-component solving.

This shows our different point of view. My understanding is that #4087 still needs a clear design and I am tempted to say that it might not even be a feature we want because it could introduce more problems than it solves. So, I am not assuming it is going to land anytime soon.

The aim is to improve the common case of "I have many components, and I just want them all to use the same bounds for foo (if they use it)". I think this what I want for approximately every package I have ever maintained, with a very few exceptions where I wanted different bounds for one component. At the moment there are a few ways to achieve this:

Why not do the simplest reasonable thing?

common common-to-all
  build-depends:
      base                           >= 4.7        && < 5
    , aeson                          >= 1.4.2.0    && < 2.2
    , amazonka                       >= 2          && < 3
    , bytestring                     >= 0.10.8.2   && < 0.12
    , directory                      >= 1.3.3.0    && < 1.4
    , exceptions                     >= 0.10.1     && < 0.11
    , filepath                       >= 1.3        && < 1.5
    , lens                           >= 4.17       && < 6
    , mtl                            >= 2.2.2      && < 2.4
    , network-uri                    >= 2.6.4.1    && < 2.8
    , oops                           >= 0.2        && < 0.3
    , text                           >= 1.2.3.1    && < 2.1

common common-to-lib-and-exe
  build-depends:
      amazonka-core                  >= 2          && < 3
    , containers                     >= 0.6.0.1    && < 0.7
    , optparse-applicative           >= 0.14       && < 0.18
    , generic-lens                   >= 1.1.0.0    && < 2.3
    , resourcet                      >= 1.2.2      && < 1.4
    , stm                            >= 2.5.0.0    && < 3

library
  import:
      common-to-all
    , common-to-lib-and-exe
  build-depends:
      amazonka-s3                    >= 2          && < 3
    , attoparsec                     >= 0.14       && < 0.15
    , conduit-extra                  >= 1.3.1.1    && < 1.4
    , cryptonite                     >= 0.25       && < 1
    , deepseq                        >= 1.4.4.0    && < 1.5
    , http-client                    >= 0.5.14     && < 0.8
    , http-client-tls                >= 0.3        && < 0.4
    , http-types                     >= 0.12.3     && < 0.13
    , process                        >= 1.6.5.0    && < 1.7
    , relation                       >= 0.5        && < 0.6
    , topograph                      >= 1          && < 2 
    , transformers                   >= 0.5.6.2    && < 0.7

executable cabal-cache
  import:
      common-to-all
    , common-to-lib-and-exe
  build-depends:
      cabal-install-parsers          >= 0.6.1      && < 0.7
    , stringsearch                   >= 0.3.6.6    && < 0.4
    , temporary                      >= 1.3        && < 1.4
    , unliftio                       >= 0.2.10     && < 0.3

test-suite cabal-cache-test
  import:
      common-to-all
  build-depends:
      Glob                           >= 0.10.2     && < 0.11
    , hedgehog                       >= 1.0        && < 1.3
    , hedgehog-extras                >= 0.4        && < 0.5
    , hspec                          >= 2.4        && < 3
    , http-types                     >= 0.12.3     && < 0.13
    , hw-hspec-hedgehog              >= 0.1.0.4    && < 0.2
    , raw-strings-qq                 >= 1.1        && < 2
    , time                           >= 1.4        && < 1.13

Note that I intentionally repeated http-types in both the library and the test, because why not, it is still reasonable. Also note that the common stanza hw-hedgehog was defined but not used anywhere.

My main qualm would be that I think until now cabal files have been exclusively "local", i.e. they talk only about the direct dependencies of the package. This feature would allow cabal files to instead say things about arbitrary other packages somewhere in the build graph. I don't know if that's bad, but I think it's new.

There are been cases where build plans get broken by a "non-local" effect. E.g. we see many when a package splits in two (network/network-uri, Cabal/Cabal-syntax). But I argue we should not rush to allow this kind of "non-local constraints" (i.e. constraints of non-dependencies) without a detailed analysis.

Anyway, this is not what the proposal is about. The proposal makes clear that this new annotation only provides a version range to matching build-depends with no vesion range.

My judgment is that it is not worth it. The discussion around what name can better communicate the subtle meaning of the stanza shows that this feature adds complexity and potential confusion. A reasonable use of common stanza like I suggested above obtains the same goal with clarity and minimum repetition.

michaelpj commented 11 months ago

Why not do the simplest reasonable thing?

Because it looks like a PITA to maintain? Every time I modify any dependency I have to check the other components and do some set intersection in my head. And I'm not sure how to do it systematically for larger packages such as plutus-core which has 21 components (!). One systematic way looks like what cabal-cache does, and I really can't face recommending that to people.

I feel a little like you're saying "no, this isn't actually annoying", which just tells me that you haven't actually tried to maintain a package where this is annoying :p We do already know about common stanzas!

Anyway, this is not what the proposal is about. The proposal makes clear that this new annotation only provides a version range to matching build-depends with no vesion range.

Well, the idea is to solve the problem, and the version with true constraints would also solve the problem, with the benefit of clarity and the downside of excess power.

What if we allowed full constraints but made cabal check warn if there is a constraint that applies to a package not listed in the build-depends of some component?

Mikolaj commented 11 months ago

What if we allowed full constraints but made cabal check warn if there is a constraint that applies to a package not listed in the build-depends of some component?

@ffaf1, since you maintain cabal check --- any obvious problems with that check? I guess it needs to be clarified first?

jasagredo commented 11 months ago

The way it is implemented right now follows the "if this component is not selected, do not apply this bound". So as an example:

package-constraints:
  bar <0

library
  build-depends: ... -- not bar

test-suite test
  build-depends:
    bar

will result in cabal build succeeding and cabal build --enable-tests failing, as described above.

jasagredo commented 11 months ago

@andreabedini The main benefit of this change is to not have to manually topologically sort your components (to apply the bound in the bottom-most component), or have to make combinatorial common stanzas, or put everything in common stanzas, but instead provide bounds that will be applied always.

I agree the name package-constraints might be confusing, because they are not the same as constraints. What about package-bounds? That conveys the fact that these are bounds that will be used in the package, but not "constraints" as generally understood.

@mpickering I did not check what happens with setup dependencies. I'm passing this when converting the cond tree to a solver list of dependencies. I think I could make it apply to those other contexts too if wanted. However build-tool-depends are Exe dependencies, not normal dependencies, but maybe I can work out something.

andreabedini commented 11 months ago

@michaelpj

I feel a little like you're saying "no, this isn't actually annoying", which just tells me that you haven't actually tried to maintain a package where this is annoying :p We do already know about common stanzas!

Alright, that's a fair rebuttal :relaxed:

What if we allowed full constraints but made cabal check warn if there is a constraint that applies to a package not listed in the build-depends of some component?

I think this is good alternative, and thinking about it, it is surprising we don't do it already. Having foo > 1 in lib and foo < 1 in exe never works and cabal should definitely warn the user.

It is not immediate how to do this without closing the door to a possible future with per-component solving. Maybe we can still consider a package description with different bounds to common dependencies as valid but cabal can emit a notice at build time?

e.g.

❯ cabal build
Resolving dependencies...
Note: component foo and bar have a common dependency on some-pkg but specify different version ranges, the effective version range is ...

TBH, I think this is a clear UX improvement (beside the exact formulation of the message of course).

I think this would need to be done by both Cabal can cabal-install actually.

michaelpj commented 11 months ago

It is not immediate how to do this without closing the door to a possible future with per-component solving.

I would propose something like:

library foo
  constraints: bar < 3
  build-depends: bar

so then you can do

common bounds
   constraints: bar < 3

library foo
  import bounds
  build-depends bar

i.e. we do things at the component level and use common stanzas to get it across components.

Maybe we can still consider a package description with different bounds to common dependencies as valid but cabal can emit a notice at build time?

That seems sensible to me. This has the advantage that we can relax those descriptions if/when we get per-component solving.

jasagredo commented 11 months ago

On Dec 7th in the Cabal devs call, we reached an agreement (which we can still iterate on). Do correct me if you think I misunderstood any part of the conclusion, please. It probably deserves a separate RFC as this one would be rejected.

grayjay commented 11 months ago

I wanted to mention an idea for making the use of default bounds clearer, though it is probably too ugly and verbose in the common use case. We could add a new syntax, such as == default to use the default bounds for a build-depends dependency. Then the default bounds would only apply to bytestring in this example:

  build-depends:
    base >= 4.9,
    bytestring == default,
    containers
michaelpj commented 10 months ago

When we discussed this I made the assertion that sometimes it's useful to constrain an indirect dependency. I tried to write down my examples here, but I think they are in fact not convincing.

jasagredo commented 10 months ago

Obsoleted by #9569