unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org
Other
1.39k stars 179 forks source link

Multiple versions of ICU4X in lockfile via `url` crate #5871

Open sffc opened 4 days ago

sffc commented 4 days ago

Currently we have the url crate in our dependency tree via icu_provider_source, which causes us to pull in multiple copies of icu4x:

icu_provider_source v2.0.0-beta1 (/usr/local/google/home/sffc/projects/icu4x/provider/source)
├── calendrical_calculations v0.1.2 
...
├── icu v2.0.0-beta1 (icu4x/components/icu)
│   ├── icu_normalizer v2.0.0-beta1 (icu4x/components/normalizer) (*)
...
├── ureq v2.10.1
│   ├── url v2.5.4
│   │   ├── idna v1.0.3
│   │   │   ├── idna_adapter v1.2.0
│   │   │   │   ├── icu_normalizer v1.5.0

@Manishearth @hsivonen

sffc commented 4 days ago

Note that this is also in our main Cargo.lock file.

[[package]]
name = "idna_adapter"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "daca1df1c957320b2cf139ac61e7bd64fed304c5040df000a745aa1de3b4ef71"
dependencies = [
 "icu_normalizer 1.5.0",
 "icu_properties 1.5.1",
]
Manishearth commented 3 days ago

It's going to be really hard to do any networking without the url crate, it's basically the only thing out there.

One of the reasons there was a much higher bar to make such changes to the crate was that there's no well integrated alternative.

We're not going to be able to drop that dep.

hsivonen commented 2 days ago

ICU4X already is optional for url.

We could have a cargo make target that runs cargo update; cargo update -p idna_adapter --precise 1.1.0 (or even cargo update; cargo update -p idna_adapter --precise 1.0.0 if we only connect by host names that are known to be ASCII).

(Additionally, we only have this problem during the time window when icu_normalizer in the ICU4X repo is not API-compatible and semver-compatible with published icu_normalizer.)

hsivonen commented 2 days ago

Also, is this really a problem? Is building a non-ICU4X normalizer as part of building ICU4X better than building an older release of icu_normalizer? (Anyway, if we connect to known-ASCII hostnames only, we can do cargo update; cargo update -p idna_adapter --precise 1.0.0 and not depend on any normalizer for this step.)

hsivonen commented 2 days ago

And one option is to use a git patch for idna_adapter in ICU4X's Cargo.toml: https://github.com/hsivonen/idna_adapter/commit/76e6d7a197e5c12e92edf4e038f6110ade35e5da

(I thought @robertbastian already did this but with a different revision of idna_adapter previously, so I'm suprisised that this issue got filed.)

sffc commented 2 days ago

Also, is this really a problem?

It's not really a problem I guess; it was just surprising to me to find icu4x 1.5 in the dependencies of icu4x 2.0.

Patching the lockfile seems like an okay idea, or we could leave it.

Manishearth commented 2 days ago

ICU4X is still large, I'd rather have the smaller legacy dep, atleast until the adapter gets updated to 2.0.

Manishearth commented 2 days ago

Anyway, if we connect to known-ASCII hostnames only, we can do cargo update; cargo update -p idna_adapter --precise 1.0.0 and not depend on any normalizer for this step

Yep. Open to that.