leanprover / reservoir

Package registry for Lean/Lake.
https://reservoir.lean-lang.org
Apache License 2.0
16 stars 1 forks source link

Building on "Latest stable" means it is impossible to ensure a package will build #10

Open digama0 opened 8 months ago

digama0 commented 8 months ago

The reservoir build process uses "latest stable" to determine whether to show the green checkmark on the front page. Here "latest stable" means the last release on github as of the start of the reservoir build job. If a package wants to ensure it is always green, it has to make sure it releases a version after the stable is released but before the reservoir build pulls the latest version of the package, which sets up a race condition that is difficult to win reliably. Mathlib can do this to some extent because Scott is responsible for releasing both versions, but there is still the overhead of actually building mathlib in the middle even assuming these are released in immediate succession. For any other project the odds of being able to catch this window is much lower.

We think it should be possible for a package to maintain always-green status as long as they have a build process like bors which structurally ensures no build errors make it to master. The core issue is that "latest stable" is an inherently racy notion, and is the whole reason why the lean-toolchain file exists, to ensure that package updates can proceed atomically and there is no point at which breakage can be observed.

For users browsing Reservoir, they will presumably be interested in looking for projects to use, and the lean version is more of a constraint. Being able to see projects that build on a fixed version is less useful, although still useful if the user has other constraints; but when this happens it is just as likely to be an earlier version than the latest stable. Certainly if the stable was released 10 minutes ago it is not particularly important because no projects have had an opportunity to move to it yet.

Finally, knowing that a project builds on a given version is not necessarily all there is to compatibility. There are also behavioral changes that may be important, and other languages generally have packages explicitly indicate their compatibility. No project should be built on a version of lean for which it declares itself to be incompatible. (This is #8.)

The lean 3 version of this compatibility matrix solves these issues in this way:

This gives users the information they need to select a version for multiple compatible projects, and because it is an opt in mechanism it does not suffer from the race condition issue which makes it impossible to maintain always-green status with the reservoir rule.

We may need to use a different rule than lean-x.y.z branches, but there should be some way for maintainers to point reservoir to the right place, because projects like mathlib spend an inordinate amount of time and effort (and money) on ensuring always-green status and this practice is throwing their work away.

tydeu commented 8 months ago

@digama0 A clarification: Reservoir actually has two different notions of latest release: latest, which is the latest release in general (i.e., a stable or release candidate) and stable which is the latest stable release only (i.e., no release candidates). The front page currently just uses the latest release, not the latest stable. As such, I think in place of "latest stable" you want to say "latest toolchain".

tydeu commented 8 months ago

Interestingly, if Reservoir did use latest stable instead for the front page, then it would also be possible to make a green checkmark as the latest stable is always feature equivalently with the most recent release candidate. Would that be a worthwhile solution for at least part of this issue?

tydeu commented 8 months ago

For the main issue itself, to highlight one point:

No project should be built on a version of lean for which it declares itself to be incompatible. (This is #8.)

A more general question is whether such incompatibilities should be represented with a red cross for that version/toolchain combination. From our discussion, this appeared to be the key point of contention. My opinion is that yes it should be represented by a red cross and yours was that it should not.

(Also, as a minor aside, #8, at least as written, is more specific to the issue of not overwriting successful builds with failed ones and more generally using a version that works with that toolchain in that listing.)

digama0 commented 8 months ago

A clarification: Reservoir actually has two different notions of latest release: latest, which is the latest release in general (i.e., a stable or release candidate) and stable which is the latest stable release only (i.e., no release candidates).

The issue I am highlighting applies equally to both "latest release" and "latest stable". There is a race to release in either case. But you make a good point: It is definitely not appropriate to be using release candidates on the front page of the package registry, if people are shopping for compatible releases RC's are not the best way to get them.

Interestingly, if Reservoir did use latest stable instead for the front page, then it would also be possible to make a green checkmark as the latest stable is always feature equivalently with the most recent release candidate. Would that be a worthwhile solution for at least part of this issue?

That doesn't necessarily work, the build process may require that you use the right version of lean even though they are feature-equivalent, especially if non-built olean files are involved. But the spirit of the idea sounds good: if there was a way to give people a month of lead time to get up to date, then I would be okay with marking them as gold checkmarks if they miss the deadline. But 10 minutes lead time is just not enough. Even 24 hours is only reasonable if you have an actual employee on the job, and most lean projects can't be run like that.

For the main issue itself, to highlight one point:

No project should be built on a version of lean for which it declares itself to be incompatible. (This is https://github.com/leanprover/reservoir/issues/8.)

A more general question is whether such incompatibilities should be represented with a red cross for that version/toolchain combination. From our discussion, this appeared to be the key point of contention. My opinion is that yes it should be represented by a red cross and yours was that it should not.

Yes, I was deliberately not addressing this question in the statement above. If a package says it is incompatible, you should take it at its word and not try to build it. If it says it is compatible, you should try to build and report a red cross if actually it isn't. If it says it is compatible with an older version and it is, well that's the big question.

I think you should summarize compatibility information with a version range, e.g. version "master" is compatible with versions 4.3.0 to 4.4.0-rc1, and this should be represented as a green checkmark as long as the version range is nonempty and it was verified to build on all those versions. Similarly there should be other checkmarks for other versions provided by the library, e.g. version "toolchain/4.2.0" is provided by the library (and the fact that this tag or branch is interesting to reservoir is communicated by smoke signals or something) and it is compatible with version 4.2.0, green checkmark.

The user is capable of looking at their lean version (if indeed they have one; they may be using library versions to choose one) and cross referencing with these version ranges to find a version of the library compatible with their setup. Red crosses are reserved for the situation where the library says it is compatible with a given version and it isn't.

tydeu commented 8 months ago

@digama0 Thank you for taking the time to lay this all out in text. I plan to discuss the design further with the team and the community in the coming weeks. I am not sure if we will reach agreement on all points, but I certainly want to minimize the area of disagreement as much as possible (e.g., if any potential half-measures are undisputed, put them on the todo list and implement them).

nomeata commented 8 months ago

I agree that a project that does reasonably well keep their stuff together shouldn't get a shameful red icon. At the same time I, as a user, would like to be able to differentiate between “builds with some version” and “is up to date with latest stable”, and would probably use two different symbols, both non-shaming positive, but distinct.

Looking more closely, this is actually the case, unless I am mistaken: both get checkmarks, but with different colors. My color perception isn't great, so I didn't see it at first, but now that I do I find this quite reasonable. (If that's the rule indeed)

Mario, in your description of the issue above you didn't point out clearly that these project do get a checkmark, and it's the green color that you care about. Is that really so important?

If it is, would it be a compromise to use a green checkmark always, and add a second icon for “works with latest stable”? But that seems a waste of screen estate compared to using the background color. Or keep the green background but add a black outline. Or something.

(An I missing the point? The color of an icon seems to be a minor point to spend so many words on.)

david-christiansen commented 8 months ago

Using only color to distinguish something like this is not great for accessibility - for people who don't see color the way I do, and for people who don't know to look for the cue (like me and apparently also @nomeata). It seems that some kind of redundant signaling in the shape would be a good idea as well, or having two icons as @nomeata suggests.

digama0 commented 8 months ago

I agree that a project that does reasonably well keep their stuff together shouldn't get a shameful red icon. At the same time I, as a user, would like to be able to differentiate between “builds with some version” and “is up to date with latest stable”, and would probably use two different symbols, both non-shaming positive, but distinct.

This is correct, currently there is vacillation between green and yellow checkmarks on mathlib master for the front page indicator (at least assuming reservoir doesn't have other build issues like #8). I do indeed consider this not good enough, because mathlib is kept up to date and being marked as not up to date is also bad. This happens because the bar for "up to date" is an absurd "as of 0 minutes ago" bar.

digama0 commented 8 months ago

I think the headline checkmark should not be using latest stable, it should just be the last build of the tool with updated dependencies on the version it specifies, similar to Github's headline checkmark. The fact that this is out of date can be seen in the more detailed version list information, but I don't think it requires any acknowledgement on the front page.

Mac has told me that this was part of the design, you are supposed to be able to go to the front page and see everything that works on the latest version. I think that this is not actually the most valuable information for users, even setting aside the issue that it makes the whole ecosystem look bad with a pockmarked record, because the user is probably not on latest stable as of 0 minutes ago, and they might not be on a particular version at all yet, they are just looking for interesting libraries. If a library happens to work on an older version, this may not be a problem at all, and really there is no substitute for actually researching the libraries and compatibility if it is more complex than "there exists a common version".

A library being out of date is something to open an issue about with them, not something to name and shame on the front page.

digama0 commented 8 months ago

This is all compounded by the fact that reservoir is a scraper, not a proper package registry (to which people... register packages). This means it catches all sorts of projects "with their pants down" not necessarily ready to provide a release, so this results in premature bad press for them, with no opt out or control from the package maintainers. (This is #6.) In a normal package registry, if you don't like the policies you can just take your package elsewhere. Reservoir doesn't even let people do that.