input-output-hk / cardano-engineering-handbook

A handbook covering cross-project policies and information for projects in the Cardano Open Source Consortium

Write a policy about the use of version bounds in Haskell packages #26

Open michaelpj opened 2 years ago

michaelpj commented 2 years ago

This is a big topic. At the moment we mostly get by with the use of index-state to get consistent behaviour from the solver, but this is not a panacea. Different projects can use different index-states and there is no guarantee that newer ones will work. Moreover, errors are not obvious and can appear in distant upstream projects, a problem which gets more painful the deeper our dependency tree gets (consider the plight of Hydra). So I think some use of bounds is probably good.
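For concreteness, this is roughly what the status quo looks like: a `cabal.project` pinning the solver to fixed snapshots of the package indexes (the repository names are real, the dates are illustrative):

```cabal
-- cabal.project: pin both indexes to fixed timestamps so the solver
-- behaves reproducibly (dates are illustrative, not from any project)
index-state: hackage.haskell.org 2022-10-01T00:00:00Z
           , cardano-haskell-packages 2022-10-01T00:00:00Z
```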


Here's my proposal:

1. Dependencies on non-Cardano packages

a) Bounds that exclude known-broken versions

Often we know that certain versions won't work. Typically we get this information in one of two ways:

  1. We bump our index-state, which gets us foo-X+1 instead of foo-X.
  2. Someone else did this and complained to us.

Then we have two options:

  1. Fix the errors for foo-X+1. Now we know that foo-X won't work, so adding a lower bound is cheap and correct.
  2. Defer fixing foo-X+1. Now we know that foo-X+1 won't work, so adding an upper bound is cheap and correct.

These are nice things to do: they give downstream consumers better errors.
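For illustration, with a hypothetical dependency foo where foo-1.1 is the known-broken version and foo-1.2 the next one, the two options come out as:

```cabal
-- Hypothetical .cabal fragments; foo and its versions are made up.

-- Option 1: we fixed our code for foo-1.2, so we know foo-1.1 (and
-- earlier) won't work -> add a lower bound:
build-depends: foo >= 1.2

-- Option 2: we deferred the fix, so we know foo-1.2 won't work (yet)
-- -> add an upper bound:
build-depends: foo < 1.2
```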

b) Speculative upper bounds

It's somewhat common for people to add upper bounds excluding versions that they don't actually know to be broken, often the next major version. I think we should not do this: it is one of the most common reasons to need allow-newer, and it pretends to have information we don't have. This is maybe a point of contention.
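For reference, the pattern in question is the PVP-style caret bound (versions made up):

```cabal
-- Speculative: "known to work with 1.2.3, assumed broken by the next
-- major version". Shorthand for foo >= 1.2.3 && < 1.3.
build-depends: foo ^>= 1.2.3
```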

2. Dependencies on Cardano packages

The arguments above also apply here, but we have additional information: namely that we have very little versioning discipline and essentially every release breaks things. That suggests that we could take a more aggressive policy: pin the major+minor version of any Cardano packages that you depend on (probably fine to let the patch version vary), i.e. do `cardano-foo == x.y.z.*` or similar.

This also means that updating to a new release of a Cardano package must be a conscious decision by the maintainer (because you have to change your version bounds). This is probably what we want!
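Concretely (versions made up), the pin would look like:

```cabal
-- Pin the first three version components of a Cardano dependency,
-- letting only the last (patch) component float:
build-depends: cardano-foo == 1.2.3.*
```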

3. Any component that isn't a library

Nobody else can depend on these, so who cares. You can probably just leave out the version bounds and rely on your local index-state for anyone who actually wants to use them.
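e.g. something like this hypothetical stanza, with no bounds at all:

```cabal
-- Hypothetical executable: nobody can depend on it as a library, so
-- the bounds are omitted and the project's index-state decides what
-- the solver picks.
executable my-tool
  main-is:       Main.hs
  build-depends: base, cardano-foo, vector
```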

michaelpj commented 2 years ago

People who I know care about this: @erikd @dcoutts @andreabedini @angerman

erikd commented 2 years ago

I completely agree with point 1, including point 1b. I think speculative upper bounds are more trouble than they are worth.

With point 2 I wonder if version numbers on our Cardano packages are actually going to follow something like the PVP and correctly reflect all changes in their APIs (many of these packages do not currently have a real API, they just expose their internals).

michaelpj commented 2 years ago

If we do start doing that (which would be nice!) then we could relax the policy to just pinning the major version, I think? Since we probably still want updating that to be a conscious decision. Perhaps that's fine already and we can just yell at people if they break things in a minor version bump until they stop.
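For illustration (versions made up), the relaxed pin would be:

```cabal
-- Pin only the PVP major version (first two components), letting
-- minor and patch releases through:
build-depends: cardano-foo == 1.2.*
```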

andreabedini commented 2 years ago

IMHO, we should stick with PVP including "speculative upper-bounds".

My arguments are as follows:

  1. We publish cardano packages on CHaP not only for internal DX but because we want people to use them. Therefore one of our goals should be making them easy to use.
  2. Anything that is written in cabal.project is irrelevant for package users, since that information is not part of the published package.
  3. Version bounds (both lower and upper) contain the information that our packages are known to work within the given ranges of their dependencies. Please note: known to work within range != known to not work outside range.
  4. Any user consuming a package with both lower and upper bounds will automatically have a plan that works and will be able to use the package.

Let's compare failure modes:

  1. The "failure" mode of a too conservative upper-bound is still a working plan, ableith with some packages that might older than necessary. I think this is still a win, as the user has a working starting point, which can be perfected (locally with "allow-newer", globally with a cabal file revision in CHaP).

  2. The "failure" mode of lack of version bounds in our packages is one of our packages not compiling out of the box. To make things worse, there's no clear indication of what the curpit might be, since any major version change in any depenency (including transitive ones, I think) could be causing the failure. This leaves our users in the dark. We have copious evidence of this internally, as we struggle to find which version works with which. At least, once the missing bound is found, the situation is easy to fix (as above: locally with "constraints" or globally with a revision).

michaelpj commented 2 years ago

I'm not sure I quite agree that 6 is easier than 5. Let's suppose that packages A-Z depend on vector, and have put in speculative bounds of vector < N. However, all of them build with vector-N except Z. Now, suppose I want to depend on vector-N. I can go through the cabal solver errors one by one, adding allow-newers to discard the (wrong) upper bounds, until eventually I do it for Z and then I get a compile error in Z. Okay, so Z needs the upper bound, I can go and get Z fixed and released in a new version. And I guess I should revise all of A-Y? :(

In the other world, my build goes through fine and just stops on exactly the problem: Z failing to build. Great, so I can revise Z to have an upper bound at its current version (since I now know that's broken), and see if I can get Z fixed to work with vector-N in a new version.

I think my hypothesis is that:

* Missing upper bounds that cause compilation failures happen somewhat rarely, and can be fixed globally (CHaP revisions), and manifest where the problem is (in the package that needs the bound).
* Unnecessary upper bounds that block the plan that you want can happen a lot, need a lot of busy-work to fix, and manifest all over the place.

lehins commented 2 years ago

> we should stick with PVP including "speculative upper-bounds".

Over-restrictive upper bounds require too much work for very little benefit. It is far more often that a major bump in a package will not break packages that depend on it than that breakage will occur. Moreover, a major version bump to a popular package like bytestring, vector, containers, etc. would require:

* a PR to every single repo with Haskell packages in the [`input-output-hk` organization](https://github.com/input-output-hk) with an update to the cabal file for every package that uses that dependency
* then every one such package will need a release to CHaP

To give you an example, it took exactly a year for Stackage to get random-1.2 because most of its downstream users in Stackage had upper bounds on it. The fun fact is that out of the thousands of packages on Hackage that depended on random, only about a dozen were actually broken. Huge waste of developers' time IMHO!

Unless we have a person whose top responsibility is to update all packages in the input-output-hk organization and keep track of version updates to all of the packages we depend on, so that we don't keep using outdated packages, I highly advise against it!

lehins commented 2 years ago

Here is another example: vector-0.13: https://github.com/commercialhaskell/stackage/issues/6624

I released vector-0.13 on Hackage over 4 months ago, and Stackage records that only 29 out of the 141 packages that had issues with the new version have actually updated their bounds. I suspect it will be months before the maintainers of all dependencies update those damn upper bounds and everyone is able to use vector-0.13. I am pretty sure only a small percentage of dependencies had actual compatibility issues with the new vector release.

Again, waste of time.

andreabedini commented 2 years ago

@michaelpj

> I'm not sure I quite agree that 6 is easier than 5.

It was my intention to say that situation 5 (over-conservative plan) is a much better place to be in than situation 6 (failing build plan). But I think I can comment on what follows anyway.

> Let's suppose that packages A-Z depend on vector, and have put in speculative bounds of vector < N. However, all of them build with vector-N except Z. Now, suppose I want to depend on vector-N. I can go through the cabal solver errors one by one, adding allow-newers to discard the (wrong) upper bounds, until eventually I do it for Z and then I get a compile error in Z. Okay, so Z needs the upper bound, I can go and get Z fixed and released in a new version. And I guess I should revise all of A-Y? :(

First, I have an issue with the use of the word "wrong" in "(wrong) upper bound". A version bound is a guarantee, and as such it is not wrong. "My package works with vector versions less than or equal to N" is not disproven by the fact that my package might work with vector version N+1.

Second, yes, of course. In the situation you describe you either need "allow-newer: vector" in your project (which I call a "local" solution), or a revision for packages A to Y (which I call a "global" solution, since it improves the situation for everyone). Of course everything is stuck until Z releases a new version.

> In the other world, my build goes through fine and just stops on exactly the problem: Z failing to build. Great, so I can revise Z to have an upper bound at its current version (since I now know that's broken), and see if I can get Z fixed to work with vector-N in a new version.

But you started in the best possible scenario! You have assumed you had a valid build plan before you started relaxing the bounds of vector to see what happens.

IMHO the more critical scenario is: some time later, after many breaking changes by many dependencies, someone tries to compile your package and it's a nightmare to figure out what versions are needed to make a valid build plan. Which is very similar to the situation we find ourselves in while building CHaP.

> * Missing upper bounds that cause compilation failures happen somewhat rarely, and can be fixed globally (CHaP revisions), and manifest where the problem is (in the package that needs the bound).

I agree that they are easy to fix (when you know where the problem is), and I agree with the solution (upper bounds). I disagree that they happen rarely (didn't you and I just spend a few weeks trying to get packages to compile?). I believe we can prevent these problems from happening at all. Think of our users! :D

> * Unnecessary upper bounds that block the plan that you want can happen a lot, need a lot of busy-work to fix, and manifest all over the place.

I believe a smart contract developer would prefer having a working plan with vector version N-1 rather than a build error, and that fixing build errors is more painful busy-work than making a revision.

andreabedini commented 2 years ago

@lehins thanks for chiming in.

> Over-restrictive upper bounds require too much work for very little benefit. It is far more often that a major bump in a package will not break packages that depend on it than that breakage will occur. Moreover, a major version bump to a popular package like bytestring, vector, containers, etc. would require:
>
> * a PR to every single repo with Haskell packages in the [`input-output-hk` organization](https://github.com/input-output-hk) with an update to the cabal file for every package that uses that dependency

How is this different from regular dev work? Most of our projects use index-state to pin versions, which is 100% equivalent to having upper bounds at the versions present in the index at that particular timestamp. Actually, it is even stricter than that, because commonly one would set upper bounds at the current major version, while index-state ends up fixing minor versions too, preventing any update.

I guess my counter-statement is: a major version bump still currently requires a PR to every project to bump the index-state.

> * then every one such package will need a release to CHaP

Not a release, a revision.

> To give you an example, it took exactly a year for Stackage to get random-1.2 because most of its downstream users in Stackage had upper bounds on it. The fun fact is that out of the thousands of packages on Hackage that depended on random, only about a dozen were actually broken. Huge waste of developers' time IMHO!

I see what you mean, but I am not sure I agree this is a valid point against speculative upper bounds. While Stackage was being updated to include random-1.2, I was still able to depend on random-1.2 directly (if I really wanted), and the upper bounds guided me to know which dependencies I could use or not. After all, no single build plan fits everyone.

EDIT: Maybe some of this is specific to Stackage and I don't understand it? Cabal file revisions relaxing bounds for random-1.2 can be filed quite quickly and do not require a round trip through the package authors (who might be living on a different planet by now).

> Unless we have a person whose top responsibility is to update all packages in the input-output-hk organization and keep track of version updates to all of the packages we depend on, so that we don't keep using outdated packages, I highly advise against it!

We are brainstorming ideas for a possible automation approach.

andreabedini commented 2 years ago

@lehins

> I released vector-0.13 on Hackage over 4 months ago, and Stackage records that only 29 out of the 141 packages that had issues with the new version have actually updated their bounds. I suspect it will be months before the maintainers of all dependencies update those damn upper bounds and everyone is able to use vector-0.13. I am pretty sure only a small percentage of dependencies had actual compatibility issues with the new vector release.
>
> Again, waste of time.

First of all, thanks for pushing vector ahead :) Second, let me understand better what you mean here. What is this "wasted time"? Is time wasted because we need to wait, or is it wasted because someone is doing something that feels non-productive?

andreabedini commented 2 years ago

Replying to https://github.com/input-output-hk/cardano-base/pull/325#issuecomment-1292811515 here to keep the discussion contained.

The long-term "fix" should be making the library compatible, but this requires a new release. What I am insisting on is: users should be able to keep the old version and have it working as it was. Don't let users' build plans break in vain.

lehins commented 2 years ago

> I believe a smart contract developer would prefer having a working plan with vector version N-1 rather than a build error, and that fixing build errors is more painful busy-work than making a revision.

A smart contract developer could pin vector to version N-1 in their project if and only if there is a problem, following up by opening a ticket on the repo describing the compatibility problem that needs fixing.

> How is this different from regular dev work?

Updating index-state is only done when we need one or more dependencies to be updated. Now if I need, say, the same vector, but all of my dependencies have an upper bound on it, it becomes my job to submit PRs that update bounds. That scenario applies to every single dependency that we use from Hackage! In other words, with over-restrictive bounds, besides updating index-state we'd have to go around and either ask maintainers or submit PRs ourselves with nothing but bounds updates!

I'll give an example. Let's say cardano-node needs a new version of nothunks, which just recently had a small breaking change. What I need to do is submit PRs to five different repos, each with only a single change across many packages: updating the upper bound for nothunks. Then I need to submit revisions for all those updated packages to CHaP, and only then can I update the index-state for cardano-node. And that process has to be repeated for every single upper bound update!
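To be clear about what each of those PRs contains, it is essentially a one-line bound bump per package (versions made up):

```diff
 library
   build-depends:
-    nothunks >= 0.1 && < 0.2
+    nothunks >= 0.1 && < 0.3
```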

> I guess my counter-statement is: a major version bump still currently requires a PR to every project to bump the index-state.

Yes, but that is the only thing that is required: a single PR with a one-line change.

> Not a release, a revision.

It doesn't matter, it is exactly the same process and requires just as much work.

> or is it wasted because someone is doing something that feels non-productive?

It is time wasted because someone is doing something that is non-productive :smile:

> thanks for pushing vector ahead

My pleasure :)

> We are brainstorming ideas for a possible automation approach.

A somewhat related question to this discussion: have we considered adding a CI job to the CHaP repo that builds all packages for every release PR, and requiring it to pass before merging? This would guarantee that all packages in the index work well together, and it would avoid all these issues with bounds, at least for inter-dependencies between packages inside CHaP.

> What I am insisting on is: users should be able to keep the old version and have it working as it was

A revision adding an upper bound upon discovery of a compatibility issue solves this problem, just like you did in that PR.

michaelpj commented 2 years ago

@lehins oh yes indeed: https://github.com/input-output-hk/cardano-haskell-packages/issues/61

michaelpj commented 2 years ago

I'd like to propose a starting point for writing something: for now, let's just say that packages MAY include a speculative upper bound if they like. I think we agree on the rest, so that will let us say something, and we can argue about whether we should change that to a MUST or MUST NOT later.

michaelpj commented 2 years ago

https://github.com/input-output-hk/cardano-engineering-handbook/pull/37 PTAL

michaelpj commented 1 year ago

Any thoughts on the proposed policy? @coot perhaps you have opinions?

coot commented 1 year ago

I am more on the @andreabedini side; using MAY might be the right compromise for now.

coot commented 1 year ago

It might actually depend on the package. For network I'd prefer to have an upper bound, so that whatever the downstream is using is well tested by us (to exclude situations where things compile but encounter runtime errors); for aeson or vector I'd be just fine using a slightly older version.

To make it easier to work with speculative upper bounds, we could agree that they are specified in cardano-prelude (although some libraries do not depend on it, e.g. network-mux; such packages we could leave without upper bounds).
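A rough sketch of that idea, with made-up bounds: cardano-prelude would carry the tight bounds once, and packages that depend on it would inherit the constraints transitively without repeating them:

```cabal
-- Hypothetical cardano-prelude.cabal: the speculative upper bounds
-- live here, in one place...
library
  build-depends: network >= 3.1 && < 3.2
               , aeson   >= 2.1 && < 2.2

-- ...and a hypothetical downstream package just depends on
-- cardano-prelude; the solver picks up its bounds transitively:
--   build-depends: cardano-prelude, network, aeson
```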