unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org

Reduce dependencies in datagen #3121

Closed Manishearth closed 1 year ago

Manishearth commented 1 year ago

So far we've not been careful about datagen deps. However, to be able to include datagen in a vendored tree, I'd like to have options for minimizing the dependencies.

A couple things that can be optimized:

robertbastian commented 1 year ago

cached_path has quite a lot of dependencies for how we're using it, we could do this ourselves with just reqwest.

robertbastian commented 1 year ago

> The zip dep can be dropped if you're providing unzipped paths

We'd have to encapsulate this on the API somehow, i.e. expose a with_cldr_dir and with_cldr_zip which would be cfg-ed. This isn't great ergonomically. Is the zip crate something that's hard to vendor/not generally available?
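A minimal sketch of what the cfg-ed API shape described above could look like, assuming a Cargo feature named `zip`. The `Sources` and `CldrSource` types are hypothetical stand-ins for illustration, not the actual icu_datagen API; only the method names `with_cldr_dir` and `with_cldr_zip` come from the comment.

```rust
use std::path::PathBuf;

/// Where CLDR JSON comes from. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum CldrSource {
    /// An already-unpacked directory; needs no extra dependency.
    Dir(PathBuf),
    /// A zip archive; the variant only exists when the assumed `zip` feature is on.
    #[cfg(feature = "zip")]
    Zip(PathBuf),
}

/// Hypothetical builder holding the configured data source.
#[derive(Debug, Default, PartialEq)]
pub struct Sources {
    cldr: Option<CldrSource>,
}

impl Sources {
    /// Always available, in every build configuration.
    pub fn with_cldr_dir(mut self, root: PathBuf) -> Self {
        self.cldr = Some(CldrSource::Dir(root));
        self
    }

    /// Compiled only with `--features zip`, so a lean build never links the zip crate.
    #[cfg(feature = "zip")]
    pub fn with_cldr_zip(mut self, archive: PathBuf) -> Self {
        self.cldr = Some(CldrSource::Zip(archive));
        self
    }

    /// Read back the configured source.
    pub fn cldr(&self) -> Option<&CldrSource> {
        self.cldr.as_ref()
    }
}
```

The ergonomic cost mentioned above shows up at the call site: code that calls `with_cldr_zip` fails to compile in a lean build, so callers need their own `cfg` or must stick to `with_cldr_dir`.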

Manishearth commented 1 year ago

> cached_path has quite a lot of dependencies for how we're using it, we could do this ourselves with just reqwest.

Yeah but I don't want to pull in reqwest either.

A lot of build systems have a "no hitting the network" rule for build time stuff (both Firefox and Google3 do this).

> We'd have to encapsulate this on the API somehow, i.e. expose a with_cldr_dir and with_cldr_zip which would be cfg-ed. This isn't great ergonomically. Is the zip crate something that's hard to vendor/not generally available?

Eh, I don't feel strongly about dropping zip, this is just something that can be explored. We could also just detect if it's a zip or a directory with the current flag.

Manishearth commented 1 year ago

"Generally available" isn't as much an issue as "do I want to have to deal with version bumps and stuff here". Google3 already has zip vendored, but if we can avoid the dep I'd like to.

robertbastian commented 1 year ago

> Yeah but I don't want to pull in reqwest either.

reqwest is a fairly common crate and more likely to be vendored already, so I think it's still useful to move from cached_path to reqwest.

> A lot of build systems have a "no hitting the network" rule for build time stuff (both Firefox and Google3 do this).

Well, we can build with reqwest but not hit the network, or we can hit the network with just std::net, so I think this argument is orthogonal.

> We could also just detect if it's a zip or a directory with the current flag.

I don't like panic!("This feature needs to be enabled")

Every avoided dep is an additional feature for the matrix...

Manishearth commented 1 year ago

> reqwest is a fairly common crate and more likely to be vendored already, so I think it's still useful to move from cached_path to reqwest.

Not in Google3. And it pulls in the entire hyper/http cinematic universe.

Actually most large hybrid projects I've seen don't use the Rust networking stack because they have their own and they want everything to go through one place.

> I don't like panic!("This feature needs to be enabled")

I think this is pretty constraining; I'm fine with that never happening in the default build but I think it's an important tool for the lean build.

TBH I'd also like to reintroduce the ability to disable experimental components.

sffc commented 1 year ago

Just want to push back on the reqwest part of this conversation: we moved away from reqwest because it didn't give a good user experience. cached_path is great because it gives us progress bars, but unfortunately we disabled those last month because of a change that started doing downloads on threads, so the UI was getting screwed up. The other thing cached_path gives us is an easy way to make sure the file is downloaded exactly once, refreshed when necessary, and saved in a safe cache directory. This is all logic we can write by hand, much of which we had in the reqwest days. So while I'm not opposed to restoring reqwest, we should think about where we went wrong in our judgement of cached_path being a good successor. In other words, let's discuss and get alignment before making a PR changing how we download zip files.
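As a rough illustration of the hand-written logic mentioned above ("downloaded exactly once ... saved in a safe cache directory"), here is a sketch that covers only the download-once part. The function name `cached_or_fetch`, the file-naming scheme, and the injected `fetch` closure are all assumptions for illustration; refresh and progress reporting are not handled, and no particular HTTP client is implied.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return the cached file for `url`, calling `fetch` only on a cache miss.
/// Hypothetical helper: the actual transport is injected by the caller.
pub fn cached_or_fetch<F>(cache_dir: &Path, url: &str, fetch: F) -> io::Result<PathBuf>
where
    F: FnOnce(&str) -> io::Result<Vec<u8>>,
{
    // Derive a flat, filesystem-safe file name from the URL.
    let name: String = url
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '.' { c } else { '_' })
        .collect();
    let target = cache_dir.join(name);

    // Cache hit: nothing is downloaded.
    if target.is_file() {
        return Ok(target);
    }

    fs::create_dir_all(cache_dir)?;
    let bytes = fetch(url)?;

    // Write to a sibling temp file and rename it into place, so an
    // interrupted run does not leave a partial file under the final name.
    let tmp = target.with_extension("part");
    fs::write(&tmp, bytes)?;
    fs::rename(&tmp, &target)?;
    Ok(target)
}
```

Keeping the transport behind a closure means the caching logic does not itself depend on reqwest or any other networking crate.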

sffc commented 1 year ago

Regarding this:

> I don't like panic!("This feature needs to be enabled")

As mentioned in the other thread, I see this as being more of "making an optional datagen flag be required". In this case, we're making --cldr-root and --icuexport-root be required. I think that's totally fine. We could even modify the clap argument parsing to reflect this if we think it would improve DX.
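One way to read "making an optional datagen flag be required" is sketched below: when download support is absent, a missing root becomes an ordinary argument error instead of a panic. The `CldrInput` type, the `resolve_cldr` function, and the `tag` parameter are hypothetical; in a real build the `networking` argument would presumably come from something like `cfg!(feature = "...")`, with the feature name being an assumption too.

```rust
use std::path::PathBuf;

/// Hypothetical result of resolving the CLDR input for one run.
#[derive(Debug, PartialEq)]
pub enum CldrInput {
    /// Use local data at this path.
    Local(PathBuf),
    /// Download the given release tag (only possible with networking).
    Download(String),
}

/// Resolve the value of `--cldr-root`.
/// `networking` says whether download support was compiled in.
pub fn resolve_cldr(
    root: Option<PathBuf>,
    tag: &str,
    networking: bool,
) -> Result<CldrInput, String> {
    match root {
        // An explicit root always wins and works in every build.
        Some(path) => Ok(CldrInput::Local(path)),
        // No root given: fall back to downloading if we are able to.
        None if networking => Ok(CldrInput::Download(tag.to_owned())),
        // Lean build: the flag is effectively required.
        None => Err("--cldr-root is required: this build has no download support".to_owned()),
    }
}
```

The same check would apply to `--icuexport-root`. Reporting it as a usage error keeps the lean build free of a runtime panic.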

robertbastian commented 1 year ago

That won't work for zip/no zip as those flags both accept zip files

Manishearth commented 1 year ago

We can have those flags always be capable of accepting a directory, and change the help text if zip is disabled.
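A sketch of that idea, assuming a Cargo feature named `zip`: the flag always accepts a directory, and both the help text and the error for a `.zip` path depend on whether zip support was compiled in. `Root`, `cldr_root_help`, and `classify_root` are hypothetical names, and the help strings are invented for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical classification of the path given to a root flag.
#[derive(Debug, PartialEq)]
pub enum Root {
    Dir(PathBuf),
    Zip(PathBuf),
}

/// Help text varies with the assumed `zip` feature.
pub fn cldr_root_help() -> &'static str {
    if cfg!(feature = "zip") {
        "Path to a CLDR JSON directory or zip file"
    } else {
        "Path to an unpacked CLDR JSON directory"
    }
}

/// Decide how to treat `path`. Directories are always accepted;
/// `.zip` paths are accepted only when zip support is compiled in.
pub fn classify_root(path: &Path) -> Result<Root, String> {
    if path.is_dir() {
        return Ok(Root::Dir(path.to_path_buf()));
    }
    let is_zip = path
        .extension()
        .map_or(false, |ext| ext.eq_ignore_ascii_case("zip"));
    if is_zip && cfg!(feature = "zip") {
        // Classification only: whether the archive exists is checked later.
        Ok(Root::Zip(path.to_path_buf()))
    } else if is_zip {
        Err(format!(
            "{}: zip support is not compiled in; pass an unpacked directory",
            path.display()
        ))
    } else {
        Err(format!("{}: not a directory", path.display()))
    }
}
```

Because the check is by extension and `is_dir()`, no flag changes meaning between builds; only the set of accepted inputs shrinks.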

sffc commented 1 year ago

Discussed with @Manishearth @sffc @robertbastian:

Manishearth commented 1 year ago

Here's the full list of new deps after we remove wasmer and cached-path, followed by the cargo tree output

adler, autocfg, bincode, block-buffer, byteorder, bzip2, bzip2-sys, cc, cfg-if, chrono, cobs, cpufeatures, crc32fast, crlify, crossbeam-channel, crossbeam-deque, crossbeam-epoch, crossbeam-utils, crypto-common, databake, databake-derive, digest, elsa, erased-serde, flate2, generic-array, hash32, heapless, iana-time-zone, icu_codepointtrie_builder, icu_compactdecimal, icu_datagen, icu_provider_blob, icu_provider_fs, itertools, itoa, libc, libm, lock_api, log, matrixmultiply, memoffset, miniz_oxide, ndarray, num-complex, num-integer, num-traits, num_cpus, pkg-config, postcard, rawpointer, rayon, rayon-core, regex-syntax, rust-format, rustc_version, ryu, scopeguard, semver, serde-aux, serde-json-core, serde_json, sha2, spin, thiserror, thiserror-impl, time, toml, typenum, version_check, zip
`cargo tree` output

```
$ cargo tree -p icu_datagen -e no-dev
icu_datagen v1.1.2 (/home/manishearth/dev/icu4x/provider/datagen)
├── clap v2.34.0
│   ├── ansi_term v0.12.1
│   ├── atty v0.2.14
│   │   └── libc v0.2.126
│   ├── bitflags v1.3.2
│   ├── strsim v0.8.0
│   ├── textwrap v0.11.0
│   │   └── unicode-width v0.1.10
│   ├── unicode-width v0.1.10
│   └── vec_map v0.8.2
├── crlify v1.0.1 (/home/manishearth/dev/icu4x/utils/crlify)
├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake)
│   ├── databake-derive v0.1.3 (proc-macro) (/home/manishearth/dev/icu4x/utils/databake/derive)
│   │   ├── proc-macro2 v1.0.51
│   │   │   └── unicode-ident v1.0.6
│   │   ├── quote v1.0.23
│   │   │   └── proc-macro2 v1.0.51 (*)
│   │   ├── syn v1.0.107
│   │   │   ├── proc-macro2 v1.0.51 (*)
│   │   │   ├── quote v1.0.23 (*)
│   │   │   └── unicode-ident v1.0.6
│   │   └── synstructure v0.12.6
│   │       ├── proc-macro2 v1.0.51 (*)
│   │       ├── quote v1.0.23 (*)
│   │       ├── syn v1.0.107 (*)
│   │       └── unicode-xid v0.2.4
│   ├── proc-macro2 v1.0.51 (*)
│   ├── quote v1.0.23 (*)
│   └── syn v1.0.107
│       ├── proc-macro2 v1.0.51 (*)
│       ├── quote v1.0.23 (*)
│       └── unicode-ident v1.0.6
├── displaydoc v0.2.3 (proc-macro)
│   ├── proc-macro2 v1.0.51 (*)
│   ├── quote v1.0.23 (*)
│   └── syn v1.0.107 (*)
├── elsa v1.7.0
│   └── stable_deref_trait v1.2.0
├── eyre v0.6.8
│   ├── indenter v0.3.3
│   └── once_cell v1.17.0
├── icu_calendar v1.1.0 (/home/manishearth/dev/icu4x/components/calendar)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── litemap v0.6.1 (/home/manishearth/dev/icu4x/utils/litemap)
│   │   │   └── serde v1.0.152
│   │   │       └── serde_derive v1.0.152 (proc-macro)
│   │   │           ├── proc-macro2 v1.0.51 (*)
│   │   │           ├── quote v1.0.23 (*)
│   │   │           └── syn v1.0.107 (*)
│   │   ├── serde v1.0.152 (*)
│   │   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr)
│   │   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   │   ├── serde v1.0.152 (*)
│   │   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec)
│   │   │       ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   │       ├── serde v1.0.152 (*)
│   │   │       ├── yoke v0.7.0 (/home/manishearth/dev/icu4x/utils/yoke)
│   │   │       │   ├── serde v1.0.152 (*)
│   │   │       │   ├── stable_deref_trait v1.2.0
│   │   │       │   ├── yoke-derive v0.7.0 (proc-macro) (/home/manishearth/dev/icu4x/utils/yoke/derive)
│   │   │       │   │   ├── proc-macro2 v1.0.51 (*)
│   │   │       │   │   ├── quote v1.0.23 (*)
│   │   │       │   │   ├── syn v1.0.107 (*)
│   │   │       │   │   └── synstructure v0.12.6 (*)
│   │   │       │   └── zerofrom v0.1.1 (/home/manishearth/dev/icu4x/utils/zerofrom)
│   │   │       │       └── zerofrom-derive v0.1.1 (proc-macro) (/home/manishearth/dev/icu4x/utils/zerofrom/derive)
│   │   │       │           ├── proc-macro2 v1.0.51 (*)
│   │   │       │           ├── quote v1.0.23 (*)
│   │   │       │           ├── syn v1.0.107 (*)
│   │   │       │           └── synstructure v0.12.6 (*)
│   │   │       ├── zerofrom v0.1.1 (/home/manishearth/dev/icu4x/utils/zerofrom) (*)
│   │   │       └── zerovec-derive v0.9.3 (proc-macro) (/home/manishearth/dev/icu4x/utils/zerovec/derive)
│   │   │           ├── proc-macro2 v1.0.51 (*)
│   │   │           ├── quote v1.0.23 (*)
│   │   │           ├── syn v1.0.107 (*)
│   │   │           └── synstructure v0.12.6 (*)
│   │   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core)
│   │   ├── bincode v1.3.3
│   │   │   └── serde v1.0.152 (*)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── erased-serde v0.3.24
│   │   │   └── serde v1.0.152 (*)
│   │   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   │   ├── icu_provider_macros v1.1.0 (proc-macro) (/home/manishearth/dev/icu4x/provider/macros)
│   │   │   ├── proc-macro2 v1.0.51 (*)
│   │   │   ├── quote v1.0.23 (*)
│   │   │   └── syn v1.0.107 (*)
│   │   ├── log v0.4.17
│   │   │   └── cfg-if v1.0.0
│   │   ├── postcard v1.0.2
│   │   │   ├── cobs v0.2.3
│   │   │   ├── heapless v0.7.16
│   │   │   │   ├── hash32 v0.2.1
│   │   │   │   │   └── byteorder v1.4.3
│   │   │   │   ├── serde v1.0.152 (*)
│   │   │   │   ├── spin v0.9.5
│   │   │   │   │   └── lock_api v0.4.9
│   │   │   │   │       └── scopeguard v1.1.0
│   │   │   │   │       [build-dependencies]
│   │   │   │   │       └── autocfg v1.1.0
│   │   │   │   └── stable_deref_trait v1.2.0
│   │   │   │   [build-dependencies]
│   │   │   │   └── rustc_version v0.4.0
│   │   │   │       └── semver v1.0.16
│   │   │   └── serde v1.0.152 (*)
│   │   ├── serde v1.0.152 (*)
│   │   ├── serde_json v1.0.93
│   │   │   ├── itoa v1.0.5
│   │   │   ├── ryu v1.0.12
│   │   │   └── serde v1.0.152 (*)
│   │   ├── stable_deref_trait v1.2.0
│   │   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   │   ├── yoke v0.7.0 (/home/manishearth/dev/icu4x/utils/yoke) (*)
│   │   ├── zerofrom v0.1.1 (/home/manishearth/dev/icu4x/utils/zerofrom) (*)
│   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   ├── serde v1.0.152 (*)
│   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
│   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_casemapping v0.7.1 (/home/manishearth/dev/icu4x/experimental/casemapping)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── serde v1.0.152 (*)
│   │   ├── yoke v0.7.0 (/home/manishearth/dev/icu4x/utils/yoke) (*)
│   │   ├── zerofrom v0.1.1 (/home/manishearth/dev/icu4x/utils/zerofrom) (*)
│   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── yoke v0.7.0 (/home/manishearth/dev/icu4x/utils/yoke) (*)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_codepointtrie_builder v0.3.4 (/home/manishearth/dev/icu4x/components/collections/codepointtrie_builder)
│   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
│   ├── lazy_static v1.4.0
│   └── toml v0.5.11
│       └── serde v1.0.152 (*)
├── icu_collator v1.1.0 (/home/manishearth/dev/icu4x/components/collator)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_normalizer v1.1.0 (/home/manishearth/dev/icu4x/components/normalizer)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
│   │   ├── icu_properties v1.1.0 (/home/manishearth/dev/icu4x/components/properties)
│   │   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   │   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
│   │   │   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   │   │   ├── serde v1.0.152 (*)
│   │   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   │   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   │   ├── serde v1.0.152 (*)
│   │   ├── smallvec v1.10.0
│   │   │   └── serde v1.0.152 (*)
│   │   ├── utf16_iter v1.0.4
│   │   ├── utf8_iter v1.0.3
│   │   ├── write16 v1.0.0
│   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   ├── icu_properties v1.1.0 (/home/manishearth/dev/icu4x/components/properties) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── smallvec v1.10.0 (*)
│   ├── utf16_iter v1.0.4
│   ├── utf8_iter v1.0.3
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
├── icu_compactdecimal v0.1.0 (/home/manishearth/dev/icu4x/experimental/compactdecimal)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── fixed_decimal v0.5.2 (/home/manishearth/dev/icu4x/utils/fixed_decimal)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── smallvec v1.10.0 (*)
│   │   └── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   ├── icu_decimal v1.1.0 (/home/manishearth/dev/icu4x/components/decimal)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── fixed_decimal v0.5.2 (/home/manishearth/dev/icu4x/utils/fixed_decimal) (*)
│   │   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   │   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   │   ├── serde v1.0.152 (*)
│   │   └── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   ├── icu_plurals v1.1.0 (/home/manishearth/dev/icu4x/components/plurals)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── fixed_decimal v0.5.2 (/home/manishearth/dev/icu4x/utils/fixed_decimal) (*)
│   │   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   │   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   │   ├── serde v1.0.152 (*)
│   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_datetime v1.1.0 (/home/manishearth/dev/icu4x/components/datetime)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── either v1.8.1
│   ├── fixed_decimal v0.5.2 (/home/manishearth/dev/icu4x/utils/fixed_decimal) (*)
│   ├── icu_calendar v1.1.0 (/home/manishearth/dev/icu4x/components/calendar) (*)
│   ├── icu_decimal v1.1.0 (/home/manishearth/dev/icu4x/components/decimal) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_plurals v1.1.0 (/home/manishearth/dev/icu4x/components/plurals) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── icu_timezone v1.1.0 (/home/manishearth/dev/icu4x/components/timezone)
│   │   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   │   ├── displaydoc v0.2.3 (proc-macro) (*)
│   │   ├── icu_calendar v1.1.0 (/home/manishearth/dev/icu4x/components/calendar) (*)
│   │   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   │   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   │   ├── serde v1.0.152 (*)
│   │   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
│   │   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
│   ├── litemap v0.6.1 (/home/manishearth/dev/icu4x/utils/litemap) (*)
│   ├── serde v1.0.152 (*)
│   ├── smallvec v1.10.0 (*)
│   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
│   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_decimal v1.1.0 (/home/manishearth/dev/icu4x/components/decimal) (*)
├── icu_displaynames v0.8.0 (/home/manishearth/dev/icu4x/experimental/displaynames)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_list v1.1.0 (/home/manishearth/dev/icu4x/components/list)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── deduplicating_array v0.1.3 (/home/manishearth/dev/icu4x/utils/deduplicating_array)
│   │   └── serde v1.0.152 (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── regex-automata v0.2.0
│   │   ├── memchr v2.5.0
│   │   └── regex-syntax v0.6.28
│   ├── serde v1.0.152 (*)
│   └── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
├── icu_locid_transform v1.1.0 (/home/manishearth/dev/icu4x/components/locid_transform)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_normalizer v1.1.0 (/home/manishearth/dev/icu4x/components/normalizer) (*)
├── icu_plurals v1.1.0 (/home/manishearth/dev/icu4x/components/plurals) (*)
├── icu_properties v1.1.0 (/home/manishearth/dev/icu4x/components/properties) (*)
├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
├── icu_provider_adapters v1.1.0 (/home/manishearth/dev/icu4x/provider/adapters)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
│   ├── yoke v0.7.0 (/home/manishearth/dev/icu4x/utils/yoke) (*)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_provider_blob v1.1.0 (/home/manishearth/dev/icu4x/provider/blob)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── log v0.4.17 (*)
│   ├── postcard v1.0.2 (*)
│   ├── serde v1.0.152 (*)
│   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   ├── yoke v0.7.0 (/home/manishearth/dev/icu4x/utils/yoke) (*)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_provider_fs v1.1.0 (/home/manishearth/dev/icu4x/provider/fs)
│   ├── bincode v1.3.3 (*)
│   ├── crlify v1.0.1 (/home/manishearth/dev/icu4x/utils/crlify)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── log v0.4.17 (*)
│   ├── postcard v1.0.2 (*)
│   ├── serde v1.0.152 (*)
│   ├── serde-json-core v0.4.0
│   │   ├── ryu v1.0.12
│   │   └── serde v1.0.152 (*)
│   ├── serde_json v1.0.93 (*)
│   ├── sha2 v0.10.6
│   │   ├── cfg-if v1.0.0
│   │   ├── cpufeatures v0.2.5
│   │   └── digest v0.10.6
│   │       ├── block-buffer v0.10.3
│   │       │   └── generic-array v0.14.6
│   │       │       └── typenum v1.16.0
│   │       │       [build-dependencies]
│   │       │       └── version_check v0.9.4
│   │       └── crypto-common v0.1.6
│   │           ├── generic-array v0.14.6 (*)
│   │           └── typenum v1.16.0
│   └── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
├── icu_relativetime v0.1.0 (/home/manishearth/dev/icu4x/experimental/relativetime)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── fixed_decimal v0.5.2 (/home/manishearth/dev/icu4x/utils/fixed_decimal) (*)
│   ├── icu_decimal v1.1.0 (/home/manishearth/dev/icu4x/components/decimal) (*)
│   ├── icu_plurals v1.1.0 (/home/manishearth/dev/icu4x/components/plurals) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── serde v1.0.152 (*)
│   ├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_segmenter v0.8.0 (/home/manishearth/dev/icu4x/experimental/segmenter)
│   ├── databake v0.1.3 (/home/manishearth/dev/icu4x/utils/databake) (*)
│   ├── displaydoc v0.2.3 (proc-macro) (*)
│   ├── icu_collections v1.1.0 (/home/manishearth/dev/icu4x/components/collections) (*)
│   ├── icu_locid v1.1.0 (/home/manishearth/dev/icu4x/components/locid) (*)
│   ├── icu_provider v1.1.0 (/home/manishearth/dev/icu4x/provider/core) (*)
│   ├── ndarray v0.15.6
│   │   ├── matrixmultiply v0.3.2
│   │   │   └── rawpointer v0.2.1
│   │   ├── num-complex v0.4.3
│   │   │   └── num-traits v0.2.15
│   │   │       └── libm v0.2.6
│   │   │       [build-dependencies]
│   │   │       └── autocfg v1.1.0
│   │   ├── num-integer v0.1.45
│   │   │   └── num-traits v0.2.15 (*)
│   │   │   [build-dependencies]
│   │   │   └── autocfg v1.1.0
│   │   ├── num-traits v0.2.15 (*)
│   │   └── rawpointer v0.2.1
│   ├── num-traits v0.2.15 (*)
│   ├── serde v1.0.152 (*)
│   ├── utf8_iter v1.0.3
│   └── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
├── icu_timezone v1.1.0 (/home/manishearth/dev/icu4x/components/timezone) (*)
├── itertools v0.10.5
│   └── either v1.8.1
├── lazy_static v1.4.0
├── log v0.4.17 (*)
├── proc-macro2 v1.0.51 (*)
├── quote v1.0.23 (*)
├── rayon v1.6.1
│   ├── either v1.8.1
│   └── rayon-core v1.10.2
│       ├── crossbeam-channel v0.5.6
│       │   ├── cfg-if v1.0.0
│       │   └── crossbeam-utils v0.8.14
│       │       └── cfg-if v1.0.0
│       ├── crossbeam-deque v0.8.2
│       │   ├── cfg-if v1.0.0
│       │   ├── crossbeam-epoch v0.9.13
│       │   │   ├── cfg-if v1.0.0
│       │   │   ├── crossbeam-utils v0.8.14 (*)
│       │   │   ├── memoffset v0.7.1
│       │   │   │   [build-dependencies]
│       │   │   │   └── autocfg v1.1.0
│       │   │   └── scopeguard v1.1.0
│       │   │   [build-dependencies]
│       │   │   └── autocfg v1.1.0
│       │   └── crossbeam-utils v0.8.14 (*)
│       ├── crossbeam-utils v0.8.14 (*)
│       └── num_cpus v1.15.0
│           └── libc v0.2.126
├── rust-format v0.3.4
│   └── proc-macro2 v1.0.51 (*)
├── serde v1.0.152 (*)
├── serde-aux v2.3.0
│   ├── chrono v0.4.23
│   │   ├── iana-time-zone v0.1.53
│   │   ├── num-integer v0.1.45 (*)
│   │   ├── num-traits v0.2.15 (*)
│   │   └── time v0.1.45
│   │       └── libc v0.2.126
│   ├── serde v1.0.152 (*)
│   └── serde_json v1.0.93 (*)
├── serde_json v1.0.93 (*)
├── simple_logger v1.16.0
│   └── log v0.4.17 (*)
├── syn v1.0.107 (*)
├── tinystr v0.7.1 (/home/manishearth/dev/icu4x/utils/tinystr) (*)
├── toml v0.5.11 (*)
├── writeable v0.5.1 (/home/manishearth/dev/icu4x/utils/writeable)
├── zerovec v0.9.3 (/home/manishearth/dev/icu4x/utils/zerovec) (*)
└── zip v0.5.13
    ├── byteorder v1.4.3
    ├── bzip2 v0.4.4
    │   ├── bzip2-sys v0.1.11+1.0.8
    │   │   └── libc v0.2.126
    │   │   [build-dependencies]
    │   │   ├── cc v1.0.79
    │   │   └── pkg-config v0.3.26
    │   └── libc v0.2.126
    ├── crc32fast v1.3.2
    │   └── cfg-if v1.0.0
    ├── flate2 v1.0.25
    │   ├── crc32fast v1.3.2 (*)
    │   └── miniz_oxide v0.6.2
    │       └── adler v1.0.2
    ├── thiserror v1.0.38
    │   └── thiserror-impl v1.0.38 (proc-macro)
    │       ├── proc-macro2 v1.0.51 (*)
    │       ├── quote v1.0.23 (*)
    │       └── syn v1.0.107 (*)
    └── time v0.1.45 (*)
```

Some thoughts:

I definitely want to keep clap. For rayon and zip, we could probably remove them, but I'm not dead set on it since they're relatively common.

robertbastian commented 1 year ago

sha2 is only used for fingerprinting, we could use a lighter hash. We could also move the fingerprinting feature out of datagen into testdatagen.
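To make "a lighter hash" concrete, here is a dependency-free sketch using 64-bit FNV-1a. The choice of FNV-1a is an assumption for illustration (the thread does not name a replacement), and the `fingerprint_line` output format is hypothetical rather than the format datagen actually writes. FNV-1a is not cryptographic, so this only suits change detection, not integrity guarantees.

```rust
/// 64-bit FNV-1a over a byte slice, using the standard offset basis and prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        // FNV-1a: XOR the byte in first, then multiply by the prime.
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical fingerprint record: path, payload size, and hash in hex.
pub fn fingerprint_line(path: &str, data: &[u8]) -> String {
    format!("{}, {}B, {:016x}", path, data.len(), fnv1a_64(data))
}
```

Unlike `std::collections::hash_map::DefaultHasher`, whose algorithm is not guaranteed to stay the same across Rust releases, a hand-written hash like this yields fingerprints that remain comparable over time.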

Manishearth commented 1 year ago

Yep

sffc commented 1 year ago

It looks like the zip crate's features control which kinds of ZIP file we support parsing. Since we control the generation of the zip files we need to parse, we could figure out which features those specific files need and disable the rest.

robertbastian commented 1 year ago

zip is also a transitive dep of cached_path; we don't use that functionality, but it cannot be disabled.

Manishearth commented 1 year ago

Once #3146 lands I'll probably make a combined PR removing the mandatory cached-path and wasmer deps from datagen, plus any other low-hanging fruit I find.