servo / rust-url

URL parser for Rust
https://docs.rs/url/
Apache License 2.0
1.27k stars 317 forks source link

MSRV bump 1.67 in v2.5.1 #937

Open tnull opened 3 weeks ago

tnull commented 3 weeks ago

The most recent patch release v2.5.1 bumped the MSRV from 1.56 to 1.67. Unfortunately, this violates a whole chain of downstream projects with a lower MSRV, e.g., reqwest (cc @seanmonstar) and dependents.

Especially since this happened in a patch release, CIs are now failing everywhere down the chain and projects that can't or are not willing to bump their MSRV are now forced to find some kind of workaround.

It would therefore be great if you could reconsider that bump and, e.g., would consider introducing the corresponding changes (I believe in https://github.com/servo/rust-url/pull/923) behind a feature (preferably defaulting to off) so that any downstream projects can actually choose whether they are willing or able to support 1.67 as their MSRV.

Manishearth commented 3 weeks ago

Not going to express a strong opinion on whether rust-url should have updated MSRV, in general for crates near the bottom of lots of deptrees I prefer extreme conservatism for MSRV but it really depends on the project.

In general I do disagree with "changing MSRV is breaking. Fixing it is a cargo update away for end-clients, and Cargo already has functionality for doing msrv-aware dep resolution that you should try using in CI. "CI breaks" is not necessarily a sign of breakage, it can just be a signal of change that you should be aware of.

....However it definitely is nice to avoid the situation where possible and provide mitigations if necessary. Would be nice to explore options.

TheBlueMatt commented 3 weeks ago

Cargo already has functionality for doing msrv-aware dep resolution that you should try using in CI

Sadly, this is only available for relatively new cargos, making it useless for fixing MSRV breaks for those supporting conservative MSRVs :(. Another year or two and it'll become useful.

Manishearth commented 3 weeks ago

FWIW I have held the position "changing MSRV is breaking unless explicitly a part of the contract" for long before that feature existed:

but still, I'd say we should attempt some mitigation here. url is pretty deep in a lot of deptrees

valenting commented 3 weeks ago

@Manishearth how would such a mitigation work? I don't think we can put the idna changes behind a feature flag. Would backing out the changes and releasing 2.5.2 work, then we release 2.6.0 with a bumped MSRV?

epage commented 3 weeks ago

The Cargo semver guidelines recommend treating MSRV as a minor incompatibility, and not a major incompatibility. This could mean bumping minor but we should also keep in mind that, for semver and Cargo, there is almost no distinction between minor and patch. The main reason to bump minor instead of patch is to give flexibility to backport fixes to the pre-MSRV bump version.

Also, the MSRV-aware resolver is not stable yet. It is available on nightly. Whether stable or nightly, using it does not require an MSRV bump. I have been using it myself to help manage my MSRV (like when I dropped the MSRV of my packages back to 1.65).

MSRV testing in CI is aided by checking in your lockfile.

epage commented 3 weeks ago

but still, I'd say we should attempt some mitigation here. url is pretty deep in a lot of deptrees

imo MSRV should be requirements driven. I have not heard of a requirement older than Debian stable which is 1.63. Even that requirement is a bit dubious as we generally view the Debian toolchain as intended for building Debian packages and not intended for end users. 1.67 is over 10 versions behind and is a year and a half old.

how would such a mitigation work?

Is the bump purely from dependencies? Can you drop the version requirements on the icu4x dependencies? For example, dropping to icu_normalizer@1.3.0 will get it down to 1.66. Unfortunately, icu4x only recently started tracking their MSRV so going further down from that would be guess and check. Looks like icu4x's MSRV policy is 4-6 versions. You can have a lower MSRV so long as their lowest MSRV is compatible with yours and you are fine not being able to require the latest features. This could mean delaying this feature completely until the lowest MSRV from icu4x works for you. There is the question of whether icu4x would be willing to lower their MSRV. That would be a question of what all they are using from newer versions and how well they can be mitigated.

tnull commented 3 weeks ago

imo MSRV should be requirements driven. I have not heard of a requirement older than Debian stable which is 1.63. Even that requirement is a bit dubious as we generally view the Debian toolchain as intended for building Debian packages and not intended for end users.

Could you clarify who is 'we' and and how you come to that conclusion? Supporting Debian stable seems like a hard requirement for quite a few users, and given that it's not even two years old rustc 1.63 seems like a reasonable MSRV target that (in my experience) a large part of the Rust ecosystem seem to be able to get behind.

TheBlueMatt commented 3 weeks ago

we generally view the Debian toolchain as intended for building Debian packages and not intended for end users

This seems like a weird conclusion. Until Fedora 34, the Debian toolchain was the only way to accomplish cross-language LTO without building both clang/LLVM and rustc from source. Thus, its in fact the only option for a toolchain that can be used in some build systems. For projects building multi-language systems which want X-lang LTO and want people to be able to build the software themselves, not supporting Debian's stable toolchain is not an option.

Manishearth commented 3 weeks ago

Could you clarify who is 'we' and and how you come to that conclusion?

I believe Ed is talking for the Cargo team here? (That's what I read those comments as)

Supporting Debian stable seems like a hard requirement for quite a few users, and given that it's not even two years old rustc 1.63 seems like a reasonable MSRV target that (in my experience) a large part of the Rust ecosystem seem to be able to get behind.

I do not think the comment was suggesting that Debian stable should not be supported, it was saying that it's unclear if crates need to strictly follow such an MSRV to support the needs of Debian stable. (Debian does not need to pull in the latest version of this crate, and other mitigations are also possible).


@valenting

how would such a mitigation work? I don't think we can put the idna changes behind a feature flag. Would backing out the changes and releasing 2.5.2 work, then we release 2.6.0 with a bumped MSRV?

I was hoping there was a way to keep the old code around as fallback. But there may be other ways.

Switching from patch to minor release would not really change the effects this has on users but it would be slightly clearer at least.

epage commented 3 weeks ago

Yes, I was speaking to a general feel I've gotten from Cargo team discussions. And it was not to be prescriptive (ie not "you shouldn't) but to provide insight when people determine what requirements drive their MSRV policy (ie "should you?").

epage commented 3 weeks ago

I was hoping there was a way to keep the old code around as fallback. But there may be other ways.

Sounds like this was a fairly invasive change so my idea might not work. What I've been experimenting with recently is to delegate the MSRV-dependent code to a dependency. You have two versions, one for the old MSRV and one for the new. They should have compatible APIs and I number the versions for what their MSRV is. The main drawback is that you can't declare through version requirements that you need new versions of each minor release so the API should likely be frozen.

example: https://crates.io/crates/is_terminal_polyfill/versions

tnull commented 3 weeks ago

Yes, I was speaking to a general feel I've gotten from Cargo team discussions. And it was not to be prescriptive (ie not "you shouldn't) but to provide insight when people determine what requirements drive their MSRV policy (ie "should you?").

Fair enough.

I do not think the comment was suggesting that Debian stable should not be supported, it was saying that it's unclear if crates need to strictly follow such an MSRV to support the needs of Debian stable. (Debian does not need to pull in the latest version of this crate, and other mitigations are also possible).

This assumption only holds if all the crate maintainers commit to providing backported (security) fixes for these 'non-latest versions', as otherwise users might get stuck on broken/potentially vulnerable versions. Also given that the MSRV breakage happened in a patch release, there is essentially no option left anyways to provide such backported patches.

DanGould commented 3 weeks ago

This is a problem for me too. I would have liked to see an MSRV break get a minor version bump.

Manishearth commented 3 weeks ago

This is a problem for me too. I would have liked to see an MSRV break get a minor version bump.

This is a good signal, if this would help people we should do it. Major bump is definitely a bigger endeavor and I'm worried about going that far, but minor bump is something we can definitely do.

Manishearth commented 3 weeks ago

@tnull

This assumption only holds if all the crate maintainers commit to providing backported (security) fixes for these 'non-latest versions', [...] Also given that the MSRV breakage happened in a patch release, there is essentially no option left anyways to provide such backported patches.

Debian package maintainers also regularly backport patches themselves, this is not a hard requirement either IMO.

I think Ed is correct in saying that the choices here do not necessarily imply "not supporting" Debian, which is already a manually-maintained not-end-user situation. This does not mean that we shouldn't try to still do things that make things nicer there, but these things aren't blockers for rust-url being used in Debian.

TheBlueMatt commented 3 weeks ago

I don't believe anyone here is lacking the ability to build some Debian rust-url package (if such a package exists). That is distinct from the use-case of seeking to build other software using cargo on Debian (for example for cross-language LTO).

djc commented 2 weeks ago

So:

Given this, what should be the path forward for the url/idna crates post-#923? Maybe it would be better for Mozilla to fork the idna crate? Maybe the ICU4X project can work on decreasing their MSRV/improving compile times?

Manishearth commented 2 weeks ago

As far as I can tell the majority of the compile time hit is in syn so there's not much ICU4X can do here beyond dropping that dependency (highly unlikely). Keeping the old code around and making this a feature flag might be the ideal path forward.

ICU4X is a lot of crates but they're small, it's trying to be modular.

valenting commented 2 weeks ago

I think I'll backout the IDNA changes, release url 2.5.2 with that, and wait for @hsivonen to have a look at the idna issues - unless someone has a better suggestion.

hsivonen commented 1 week ago

So:

  • reqwest (the most popular dependent crate by far) has pinned 2.5.0 because of this issue

I think this characterization makes it look like reqwest did something drastic when what reqwest did was something very mild. I think it's very unfortunate if the above characterization lead to the backout.

reqwest did not pin url in its Cargo.toml. reqwest pinned url in its CI task called msrv.

If someone is using a very old compiler (for whatever reason), it should be up to them to run the appropriate cargo incantations to make the dependencies close enough in vintage to the compiler vintage for stuff to work. AFAICT, what reqwest did with their msrv CI config is expected, normal, and not in it itself a rebuke of the url 2.5.1 release.

The most significant contributor to this concern is syn. More on this below.

This isn't fundamentally a consequence of depending on ICU4X, so I think it makes sense to discuss this as a separate concern in the issue that you've filed.

Maybe it would be better for Mozilla to fork the idna crate?

I have previously gone the route of introducing another Firefox-motivated crate for a topic that already had a pre-existing crate and I got complaints about hurting the ecosystem (though eventually the ecosystem migrated to the crate I wrote). Whichever way someone thinks it's the wrong way.

Maybe the ICU4X project can work on decreasing their MSRV/improving compile times?

AFAICT, the key grievances here are:

  1. The way folks know how to get a toolchain that works with cross-language LTO is Debian, and rustc in Debian is older than ICU4X's MSRV.
  2. If you didn't already have syn in your transitive dependencies, compile times got significantly longer.

These are both general Rust ecosystem issues:

  1. In general with Rust, as time goes on, the MSRV number goes up.
  2. In general with Rust, as time goes on, the probability of your project ending up with syn in its transitive dependencies increases.

The second point may look glib, but proc macros are a part of the Rust ecosystem and for better or worse, the way doing proc macros works in practice involves depending on syn in some manner.

Looking at the Cargo.lock of Firefox, I see 48 crates that depend on syn. The first match (alphabetically) happens to be one that you wrote! Asking you to solve the Rust ecosystem issue that proc macros in practice causing a dependency on a crate that takes a long time to compile is about as unreasonable as suggesting that ICU4X developers solve that problem.

As for MSRV, I think it's unhealthy for the ecosystem to expect crate authors (ICU4X developers or others) to refrain from making use of new language or standard library facilities on Debian timelines. I think it should be up to folks who choose to use a very old compiler to do the work of getting the crates that they pull from crates.io to compile, possibly by doing the work of figuring out which older version of a given crates compiles given the compiler vintage.

(Aside about how old Debian is: The safety-critical embedded industries have a reputation of moving slowly, but not only is the latest TÜV-certified Rust toolchain new enough to deal with an MSRV of 1.67, but the previous one is new enough, too.)

As a matter of the health of the Rust ecosystem, it would be much better to ship clang as a rustup component than to have the crate ecosystem cater to Debian-shipped compilers. But that's out of scope for url, idna, or ICU4X to solve.

Quoting from the OP:

It would therefore be great if you could reconsider that bump and, e.g., would consider introducing the corresponding changes (I believe in #923) behind a feature (preferably defaulting to off) so that any downstream projects can actually choose whether they are willing or able to support 1.67 as their MSRV.

I think it's not at all a given that if there was a Cargo feature (either in url or in idna) controlling whether 1.67-requiring stuff gets pulled in, defaulting to off would be the right call for the ecosystem.

The issue here focuses on changes to compile-time attributes of url (by the way of transitive dependencies via idna) and views those changes negatively. I suggest it is appropriate to consider the changes other attributes as well.

The idna crate change had these other attributes:

  1. The performance got faster (at least for normal use as by url; see #938 about specialized batch processing of domain names without them being part of a URL or similar)
  2. The optimized, debug-info-removed binary size got smaller. (Wasm measured.)
  3. Correctness increased (the implementation of the ContextJ rule was previously missing)
  4. The API got more misuse-resistant (by limiting the transitional mode to API surface marked deprecated and by steering the caller not to turn the ASCII deny list completely off but to use e.g. the WHATWG ASCII deny list when the STD3 ASCII deny list does not work in practice)

An issue like this selects for hearing from folks who are unhappy with the update, but chances are that for users who make at least some effort to go along with the "stability without stagnation" Rust update cycle, the MSRV of 1.67 is a non-issue and faster, smaller, and more correct is preferable over Debian-accommodating MSRV.

For folks whose project already depends on one of the many crates that transitively depend on syn, the syn compile time issue is sunk cost. The syn compile time is, of course, an unhappy new thing for projects that don't already happen to transitively depend on syn, but even some of those might prefer the new attributes of the compilation product over the downside of the compilation itself taking longer.

But back to the Cargo feature and not its default:

It would be bad for non-top-level crates to set the cargo feature, since opting for old compiler rather than output speed, size, and correctness isn't something that an intermediate dependency should choose.

Right now, a Cargo feature on url to choose idna 0.5 over idna 1.0 would not have much of a benefit compared to the top-level Cargo.lock pinning url 2.5.0. In the future, it might be the case that url would get other improvements so that a Cargo feature would allow for getting those improvements while freezing idna to 0.5.

If the request is to be able to continue to get improvents to idna but without depending in ICU4X, is someone for whom very old MSRV or syn avoidance is important going to keep providing non-ICU4X implementations of the improvements? Note that a key internal design decision for idna 1.0 is that instead of there being separately an NFC normalizer and an UTS 46 mapper, UTS 46 data is represented as normalization data, as in ICU4C, so that the mapping and normalization aspects that are spec-wise distinct and implementation-wise fused.

CryZe commented 1 week ago

@hsivonen I've replied to the claim that syn is the reason for the compile times here: https://github.com/servo/rust-url/issues/939#issuecomment-2188641562

tl;dr syn isn't really the problem