Closed: Mark-Simulacrum closed this 3 weeks ago
We discussed this regression in today's @rust-lang/libs-api team meeting, and agree there's nothing to change on Rust's end. Those repos that have an old version of `time` in a lockfile will need to update that.
I would love it if someone were interested in cataloguing all the "type annotations needed" type-inference breakages caused by the addition of a trait impl throughout the history of the standard library, and in thinking about building static analysis that could forewarn about code that is at risk of becoming ambiguous (e.g. `a += b.as_ref()`). It may be something that widely used crates want to enforce in CI.
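For readers who haven't seen the failure mode, here is a minimal, illustrative sketch of the class of ambiguity involved. This is not the actual `time` code; it just shows how the new `FromIterator` impls for `Box<str>` in 1.80 turn a previously unambiguous `collect` into an error:

```rust
fn main() {
    // Before 1.80, `Box<[char]>` was the only `FromIterator<char>` candidate
    // that could fill in a `Box<_>`, so inference picked it. With the new
    // `impl FromIterator<char> for Box<str>` there are two candidates, and
    // the following line now fails with a "type annotations needed" error:
    //
    //     let broken: Box<_> = "abc".chars().collect();
    //
    // The fix is a one-line annotation naming the intended type:
    let fixed: Box<[char]> = "abc".chars().collect();
    assert_eq!(&*fixed, &['a', 'b', 'c']);
}
```

A lint or static analysis could flag `collect()`/`into()` calls whose target type is only pinned down by the current set of trait impls, which is exactly the kind of check a widely used crate could run in CI.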
Backward compatibility is difficult =(
So what exactly is the solution to resolve this?
@ayushgupta0610 Answer is here: https://github.com/rust-lang/rust/issues/127343#issuecomment-2218261296
I am not sure how @rust-lang/libs-api can look at 5400 regressions and say "eh, that's fine".
Is it because it was only one line?
Is it because Mark said "eh, probably nothing to do"?
The meeting notes from July 9th, 2024 include a recommendation for a process change regarding inference regressions: be more cautious, and revert first and then investigate, for such changes that do not have a crater run.
That process change has not been written down anywhere.
This issue has not been resolved until it has or the team has rejected even that.
The apparent heuristic was "more crates than time"?
https://github.com/rust-lang/rust/issues/125319#issuecomment-2203697058
Sorry, but that's nonsensical; that's like saying we should ship changes which break `proc-macro2` because it's "just one crate".
There are many possible ways, like future-compatibility warnings, that could be used to help migrate the ecosystem... gently... to a new version over time (heh), even if we do the break anyway. As far as I am aware, T-libs-api looked into none of them.
They are not too challenging to implement, and for something on `time`'s scale of impact, the effort is warranted. Because ultimately, a 3-month cycle is not enough for this when it comes to changes that have a big impact and a minimal gain, right down the middle of the ecosystem.
We have effectively, by proxy, harassed the maintainers of `time` and even some distros because we couldn't bring ourselves to land a `git revert`. And now we are on the verge of doing nothing, not even implementing a better process for this.
Perhaps it might be a bit late for some changes now that it's on stable, but I've seen us say "we should do this" and let it slip by before. It's worst with issues that are uncomfortable to think about.
Like big mistakes.
It may be a "mere" inference regression, but this impact has been big enough that people are calling it "rust 2". Obviously, it ships with none of the advantages of actually doing a deliberate semver major break. Calling it "rust 2" is a bit hyperbolic, but people tend to say such things to express a legitimate feeling that is hard to convey in ordinary or technical language.
Like a loss of trust.
For an idea of the kind of impact this regression has had:
In NixOS, unstable builds (both local and on CI) now fail because of this, and fixing that will require new versions of each of the affected packages.
The problem with that, though, is that e.g. `cargo-outdated` hasn't seen any upstream maintenance for months now, which means that unless a third party is able to step up, the NixOS derivation for `cargo-outdated` will remain broken.
I picked on `cargo-outdated` here, but it's not the only Rust crate that has now become effectively uncompilable... because of a change to `rustc`.
A lint for easily breakable inference items would indeed help. Another long-term idea would be `inferred-types.lock`, a lock file that would contain the types inferred for the build. (Exactly what format it should take seems like a challenging problem, but not an impossible one.) That would help keep unmaintained or slowly maintained binaries from breaking when new compilers change type inference.
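To make the shape of that idea a bit more concrete, here is a toy model of what one entry in such a lock file might need to capture; the name, fields, and format are entirely hypothetical:

```rust
// Hypothetical model of a single `inferred-types.lock` entry: the inference
// site plus the type the compiler chose when the lock file was written.
struct InferredType {
    file: &'static str, // path within the package
    line: u32,          // location of the inference site
    ty: &'static str,   // e.g. the elided type behind a `collect::<Box<_>>()`
}

fn main() {
    let locked = InferredType {
        file: "src/lib.rs",
        line: 42,
        ty: "Box<[char]>",
    };
    // A future compiler could fall back to the locked type whenever its own
    // inference becomes ambiguous, instead of failing the build outright.
    println!("{}:{} => {}", locked.file, locked.line, locked.ty);
}
```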
cc @rust-lang/lang
...just for our visibility.
Is some sort of special-case compiler hack feasible here to keep the old inference working? Possibly only for edition < 2024.
I know @compiler-errors mentioned something along those lines when adding float `From` impls caused regressions, not sure how difficult it would be here.
> Is some sort of special-case compiler hack feasible here to keep the old inference working? Possibly only for edition < 2024.
Idea: All trait impls in the standard library have a `#[stable(since = "1.X.0")]` attribute which indicates when the trait impl was added. We could add a compiler flag `-Z ignore-new-trait-impls-after=1.Y.0` which would cause the trait resolver to ignore any new trait impls added after the specified version. This version could then be recorded in Cargo.lock and passed to rustc by Cargo.
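To show the shape of that proposal, here is a toy model (the flag, the version numbers, and everything else here are hypothetical; nothing like this exists in rustc today): the resolver would simply drop candidate impls whose `since` version is newer than the cutoff recorded in Cargo.lock.

```rust
// Toy model of version-gated trait selection: filter out impls stabilized
// after the cutoff before running candidate selection.
#[derive(Debug)]
struct TraitImpl {
    desc: &'static str,
    stable_since: (u32, u32), // (major, minor), e.g. (1, 80); illustrative
}

fn candidates_for(cutoff: (u32, u32), impls: &[TraitImpl]) -> Vec<&TraitImpl> {
    impls.iter().filter(|i| i.stable_since <= cutoff).collect()
}

fn main() {
    let impls = [
        TraitImpl { desc: "impl<I> FromIterator<I> for Box<[I]>", stable_since: (1, 32) },
        TraitImpl { desc: "impl FromIterator<char> for Box<str>", stable_since: (1, 80) },
    ];
    // With a cutoff of 1.79 recorded in Cargo.lock, only the older impl
    // remains visible, so existing `collect::<Box<_>>()` calls stay unambiguous.
    for candidate in candidates_for((1, 79), &impls) {
        println!("{candidate:?}");
    }
}
```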
A data point: in `$WORK_PROJECT` we can't update `time`, because that forces us to update `serde`, and new (patch) versions of `serde` change serialization logic, resulting in differing outputs of `quick_xml`. As far as I'm aware, it's impossible to have two differing "patch" versions of a single crate in the build graph.
(I'm not sure why `serde` consistently makes behaviour changes in patch releases, but I can't affect this.)
I appreciate everyone's enthusiasm, but unless we want to discuss a 1.80.2 that reverts this change, technical solutions for how to address this concrete inference regression are somewhat off-topic at this point.
More specifically, T-libs-api, in this case, decided that it was okay to accept this major inference regression and ask that someone else put in such work, in that order[^0]. They could have waited for someone else to put in such work, and only then accepted the regression. They did not. Now we must explain why this issue remains unresolved if it is to continue being open, and the key reasons I have reopened this issue are:
The technical solutions cannot matter anymore (or if they do matter, they matter in a way beyond this issue) because, no matter what, it will remain some kind of regression, since no mitigation can travel back in time and e.g. check in lockfiles to repos before it was implemented. A solution which does not require such will likely have other problems instead. So regardless of what solutions are available or become available, they are futile, unless a future trait implementation that causes another major inference regression is blocked on these mitigations being implemented and allowed time to take effect. It may, arguably, not be T-libs-api's job to build or improve the relevant tools, but it is their job to see to it that, when those have to land first in order for a change to be viable, they have actually landed first.
Any technical solution that is actually relevant to this regression, and about as speedy to implement as the aforementioned revert, should be posted as a pull request.
[^0]: which may require rereading RFC 1105 until we have internalized this paragraph: "Although this general policy allows some (very limited) breakage in minor releases, it is not a license to make these changes blindly. The breakage that this RFC permits, aside from being very simple to fix, is also unlikely to occur often in practice. The RFC will discuss measures that should be employed in the standard library to ensure that even these minor forms of breakage do not cause widespread pain in the ecosystem."
@workingjubilee
Many of the affected crates are unmaintained and probably won't have been updated even a year from now. Even for those crates which have been updated, someone might want to build an older version a year or 5 or 10 from now, perhaps in order to bisect some old regression.
Therefore:
> I appreciate everyone's enthusiasm, but unless we want to discuss a 1.80.2 that reverts this change, technical solutions for how to address this concrete inference regression are somewhat off-topic at this point.
It doesn't have to be 1.80.2. Even if a solution doesn't land until 1.81 or 1.82 or 1.83, it would still be beneficial in the long run.
> A solution which does not require [added content in lockfiles] will likely have other problems instead
I agree that a forward-looking solution to this kind of breakage should probably involve lockfiles. And such a solution ought to be pursued. But independently of that, a one-off hack to address this specific regression would also be beneficial.
For example, what if Cargo had a hardcoded list of crate versions with known issues, and corresponding patches to apply before compilation? Such an approach would not scale, but it could work for messy ecosystem-wide issues like this one.
Would it be worth rustc implementing "quirks" (like https://github.com/WebKit/WebKit/blob/48ed5a35e74335c3c98f51320d32e5ee29c6b3a5/Source/WebCore/page/Quirks.cpp) to work around breakage in widely used packages? Something like `if crate.name == foo && crate.version == bar { unbreak_inference(); }`. I think this should be used sparingly, if at all, but perhaps it is preferable to widespread breakage.
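Spelled out, such a quirks table would amount to a hard-coded allow-list consulted before applying the new inference behaviour. This is purely a hypothetical sketch, not rustc internals, and (as the reply below points out) rustc does not currently even know crate versions:

```rust
// Hypothetical quirks table: (crate name, version) pairs that would keep the
// pre-1.80 trait-resolution behaviour. Versions listed are illustrative only.
const INFERENCE_QUIRKS: &[(&str, &str)] = &[
    ("time", "0.3.34"),
    ("time", "0.3.35"),
];

fn needs_old_inference(name: &str, version: &str) -> bool {
    INFERENCE_QUIRKS.iter().any(|&(n, v)| n == name && v == version)
}

fn main() {
    assert!(needs_old_inference("time", "0.3.34"));
    assert!(!needs_old_inference("some-other-crate", "1.0.0"));
}
```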
@rben01 at that point, it'd be better to roll back and remove the new impl instead while we work on a better mechanism. Also, `rustc` doesn't know about crate versions, so we don't even have enough info to do that today even if it were decided to be a good idea. We have had checks like that, like `proc_macro_back_compat`, but we really don't want to get in the habit of doing that.
From my understanding, rustc should only make breaking changes outside of editions when it is absolutely necessary to fix a critical bug, e.g. a soundness one. Being able to compile old Rust code with newer compilers on the same old edition is a crucial property: Rust projects should be finish-able, so that one can leave a good piece of code alone, without further updates, while it still remains fully useful.
Since some projects may pin a Rust major version, I do think a fix for this issue should be published as Rust 1.80.x so that in the future 1.80 will not be known as a cursed version that breaks many old crates.
> From my understanding, rustc should only make breaking changes outside of editions when it is absolutely necessary to fix a critical bug, e.g. a soundness one.
That is not our current policy. See: https://rust-lang.github.io/rfcs/1105-api-evolution.html
(Maybe we'd all be better off if the rust stability rule was the one that overly pushy advocates sometimes like to pretend it is: "the rust compiler never ever ever breaks old code".)
> This does not mean that the standard library should be completely free to make non-semver-breaking changes; there are sometimes still risks of ecosystem pain that need to be taken into account. Rather, this RFC makes explicit an initial set of changes that absolutely cannot be made without a semver bump.
@juntyr it is not the "absolutely cannot be made" rule that was broken here.
As above:
> I am not sure how @rust-lang/libs-api can look at 5400 regressions and say "eh, that's fine".
>
> It may be a "mere" inference regression, but this impact has been big enough that people are calling it "rust 2". Obviously, it ships with none of the advantages of actually doing a deliberate semver major break. Calling it "rust 2" is a bit hyperbolic, but people tend to say such things to express a legitimate feeling that is hard to convey in ordinary or technical language.
It is the "risks of ecosystem pain that need to be taken into account" rule that was vehemently broken and ruined a lot of people's day. This is made worse, not better, by a policy that says it's permissible to make large ecosystem-breaking changes once every 1.5 months rather than once a year, while simultaneously end-of-lifing all versions of the compiler without the regression, because "only certain types of regressions fall under the stability policy" but also "old stuff bad. Damn the torpedoes, full speed ahead!"
5400 crates is about 2-3 orders of magnitude more broken crates than we typically accept when the compiler makes breaking changes.
During the last API meeting #100551 was cited as another example of breakage that had the same order of magnitude in terms of crate counts, though we're not sure why this one is getting more attention than that one. Crate count may not be the right metric.
@the8472 If I recall correctly the `socket2` situation, the breakage was for `tokio` 0.1 and 0.2, at a time when 1.0 had existed for 2 years. In contrast, `time` 0.3.36 had existed merely for weeks.
I'm aware; nevertheless, the crate count was still that high during the final beta crater run, not just during feature development. Perhaps the age/activity distribution of the affected crates was different. But that suggests crate count is not a good metric.
That was `socket2`'s fault for unsafely bypassing stability guarantees. It would have been within the API evolution guidelines even if it broke every crate. Relying on inference is different, since that has explicit guidelines to try and minimize breakage.
EDIT: and despite this, the PR that led to #100551 was delayed for almost two years to give `socket2` users time to migrate to newer versions.
Closing, since the regression is mentioned in the release notes.
Mostly reverse-dependencies of the `time` crate (before https://github.com/time-rs/time/commit/5b0c627366babea1636b35db641c0fec964ddbd1, released in 0.3.35). My sense is there's probably nothing to do here, but lots of crates have older versions of `time` in Cargo.lock. Nominating for libs-api to approve the wide spectrum of regressions. (Caused by https://github.com/rust-lang/rust/pull/99969.)