Closed robertbastian closed 1 week ago
Actually, data "generation" happens in the provider crate, so maybe the driver crate should be `icu_export` (with `ExportDriver`), and the provider crate `icu_provider_datagen` (`icu_datagen` if we don't want to retire the crate name; however, all our crates that define providers start with `icu_provider_`).
The most modular setup would probably be
- `icu_datagen_transform` defines `DatagenProvider`
- `icu_datagen_driver` defines `DatagenDriver`
- `icu_datagen` defines the `icu4x-datagen` binary

That's my proposal, just with different names.
We could also move `icu_provider::datagen` into the driver crate.
Latest proposal:
- `DatagenDriver`, `icu_provider::datagen::{ExportMarker, ExportableDataProvider, ...}`, the registry (`make_exportable_provider!`), `BakedExporter`
- `icu_export`
- `DatagenDriver` -> `ExportDriver`
- `DatagenProvider`
- `icu_provider_datagen`
- `icu_datagen`, `icu_provider_source`
- `icu4x-datagen`
- `icu4x-datagen`
- `DatagenProvider`
- `icu4x-reexport`, `icu4x-data-transform`
- `icu4x-datagen` instead

ICU4X-WG discussion:
- `ExportMarker` and things that are datagen-specific.
- `icu_export`, it's not clear that it is a provider crate and not a component crate. It looks like a component called "export".
- `icu_provider_export`.

Macro structure brainstorm:
macro_rules! make_exportable {
    ([$($marker:path,)*], [$($experimental_marker:path,)*]) => {
        #[cfg(feature = "experimental_components")]
        icu_provider::make_exportable_provider!([$($marker,)* $($experimental_marker,)*]);
        #[cfg(not(feature = "experimental_components"))]
        icu_provider::make_exportable_provider!([$($marker,)*]);
    }
}
// uses call-site Cargo features
registry!(make_exportable);
macro_rules! cb {
    ($($marker:path),*) => {
        fn all_keys() -> ... {
            HashSet::from_iter([
                $(<$marker>::KEY.path()),*
            ])
        }
    }
}
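
To make the callback idea concrete, here is a minimal self-contained sketch of a `registry!` macro that hands a stable and an experimental marker list to a callback macro. The `Marker` trait, the three marker types and their path strings are hypothetical stand-ins, not the real `icu_provider` API, and the sketch matches markers as `ty` rather than `path` to keep the trait projection simple. In a real crate the experimental functions would presumably sit behind `#[cfg(feature = "experimental")]`; that gate is left out here so the example compiles without any Cargo features.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a data marker that knows its key path.
pub trait Marker {
    const PATH: &'static str;
}

pub struct HelloWorldV1Marker;
impl Marker for HelloWorldV1Marker {
    const PATH: &'static str = "hello/world@1";
}

pub struct DecimalSymbolsV1Marker;
impl Marker for DecimalSymbolsV1Marker {
    const PATH: &'static str = "decimal/symbols@1";
}

pub struct ExperimentalThingV1Marker;
impl Marker for ExperimentalThingV1Marker {
    const PATH: &'static str = "experimental/thing@1";
}

/// The registry owns the marker lists and passes them to a callback macro.
macro_rules! registry {
    ($cb:ident) => {
        $cb!(
            [HelloWorldV1Marker, DecimalSymbolsV1Marker,],
            [ExperimentalThingV1Marker,]
        );
    };
}

/// One possible callback: turns the two lists into key-listing functions.
macro_rules! make_key_fns {
    ([$($stable:ty,)*], [$($experimental:ty,)*]) => {
        pub fn all_stable_keys() -> BTreeSet<&'static str> {
            vec![$(<$stable as Marker>::PATH),*].into_iter().collect()
        }

        pub fn all_experimental_keys() -> BTreeSet<&'static str> {
            vec![$(<$experimental as Marker>::PATH),*].into_iter().collect()
        }

        /// Union of the stable and experimental keys.
        pub fn all_keys() -> BTreeSet<&'static str> {
            let mut keys = all_stable_keys();
            keys.extend(all_experimental_keys());
            keys
        }
    };
}

// The callback is expanded at the call site, so any `cfg` attributes it
// emitted would be evaluated against the calling crate's features.
registry!(make_key_fns);
```

Because the registry only forwards token lists, other callbacks (for example one that forwards to an exportable-provider macro) could reuse the same lists without the registry knowing about them.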
- `icu` crate
  - `all_stable_keys()`
  - `#[cfg(feature = "experimental")] all_experimental_keys()`
  - `#[cfg(feature = "experimental")] all_keys()` (maybe)
  - `key(str)`
- `icu_provider`
  - `icu_provider::datagen::*`
- `icu_datagen`
  - `ExportDriver` (needs rayon, fallback)
  - `baked_exporter` module (feature-gated)
  - `icu_provider_blob::export` (feature-gated)
  - `icu_provider_fs::export` (feature-gated)
- `icu_provider_source`
  - `SourceDataProvider`
  - `icu` (registry) and `icu_provider` (for `_::datagen::*`) to implement `icu_provider::datagen::ExportableProvider`
- `icu4x-datagen`
  - `icu_datagen`, `icu_provider_source`, `icu` (`all_stable_keys`, `all_experimental_keys`, `key`)
  - `icu_provider_source`'s cargo features
    - `use_wasm = ["icu_provider_source?/use_wasm"]`
  - `icu_provider_source` feature, allows blob inputs (current `icu_datagen_dart`)

Shane's version:
- `icu` metacrate as above
- `icu_provider_transform`
- `icu_datagen`
- `icu_provider_transform`
- `icu4x-datagen` binary

Conclusion:
- `icu_datagen` will have stuff pulled out from it:
- `DatagenProvider` goes behind a feature. The feature impacts binary behavior. We could fail to install the binary if the feature combinations are incompatible via a feature-gated compile error.
- `registry!` macro as designed above; keep in `icu_datagen`
- `DatagenProvider` gets pulled out into its own crate (names to be bikeshed) and `icu_datagen` does not depend on it
- `registry!` macro to `icu` metacrate
- `icu4x-datagen` gets pulled out into a binary-only crate called `icu4x-datagen`

LGTM: @robertbastian @sffc
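
For reference, the feature-gated compile error mentioned in the conclusion could look roughly like the sketch below. The feature names `bin` and `provider` are assumptions made for illustration, since the thread does not settle on names; `compile_error!` and `cfg!` are standard Rust macros.

```rust
// Hypothetical feature names: `bin` builds the CLI, `provider` enables the
// source-data provider. If the CLI is requested without the provider, the
// build (and therefore `cargo install`) stops with a readable message.
#[cfg(all(feature = "bin", not(feature = "provider")))]
compile_error!("the `bin` feature requires the `provider` feature");

/// Reports whether this build was compiled with the (hypothetical)
/// `provider` feature, so the binary can adapt its behavior at runtime.
pub fn can_read_source_data() -> bool {
    cfg!(feature = "provider")
}
```

With neither feature enabled the guard is inactive and `can_read_source_data()` returns `false`, which is one way the feature could "impact binary behavior" without a second code path in the driver.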
Discussion:
- `icu_provider`, to distinguish them from components

icu_provider_export = { version = "~1.5.0", path = "provider/export" }
# DatagenDriver -> ExportDriver
icu_provider = { version = "~1.5.0", path = "provider/core" }
icu_provider_macros = { version = "~1.5.0", path = "provider/core/macros" }
icu_provider_adapters = { version = "~1.5.0", path = "provider/adapters" }
icu_provider_baked = { version = "~1.5.0", path = "provider/baked" }
icu_provider_blob = { version = "~1.5.0", path = "provider/blob" }
icu_provider_fs = { version = "~1.5.0", path = "provider/fs" }
icu_provider_source = { version = "~1.5.0", path = "provider/source" }
# DatagenProvider -> SourceDataProvider
icu_provider_registry = { version = "~1.5.0", path = "provider/registry" }
- `icu_provider` prefix
- `icu_provider_baked/export` if easy with new paths

LGTM: @sffc @robertbastian

These two components are fairly independent, and there is now a use case for the datagen driver that doesn't use the CLDR/ICU-backed provider (`icu_datagen_dart`). It would be nice to be able to build it without pulling in all the logic and dependencies for parsing CLDR.

Idea:
- `icu_datagen` - `DatagenDriver` and options
- `icu_provider_source` - `DatagenProvider` `SourceDataProvider` `zip`, `wasm`, `ureq` etc.)
- `icu4x-datagen` - The CLI, which depends on both crates
- `cargo install icu_datagen` installs a binary called `icu4x-datagen` is confusing
- `bin` feature)
- `icu_datagen_dart` depends on `icu_datagen` and `icu_provider_blob`, making it a lot more lightweight
- `icu_datagen_dart` approach, i.e. have one universal blob generated from sources (long build time and long gen time, but shared), and then filter that down for use cases (short build time and short gen time).
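
The split described above can be sketched with toy types: a driver that only holds export options and talks to its data through a trait, so it never needs the CLDR-parsing dependencies. Everything here (`DataSource`, `InMemoryBlob`, `Driver` and their methods) is hypothetical and only mirrors the roles discussed in this issue, not the actual `icu_datagen` or `icu_provider_blob` APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical provider interface: the only thing the driver depends on.
pub trait DataSource {
    /// Locales for which `key` has data.
    fn locales_for(&self, key: &str) -> Vec<String>;
    /// Serialized payload for one (key, locale) pair, if present.
    fn load(&self, key: &str, locale: &str) -> Option<Vec<u8>>;
}

/// Lightweight source standing in for a pre-generated universal blob.
#[derive(Default)]
pub struct InMemoryBlob {
    entries: BTreeMap<(String, String), Vec<u8>>,
}

impl InMemoryBlob {
    pub fn insert(&mut self, key: &str, locale: &str, payload: &[u8]) {
        self.entries
            .insert((key.to_string(), locale.to_string()), payload.to_vec());
    }
}

impl DataSource for InMemoryBlob {
    fn locales_for(&self, key: &str) -> Vec<String> {
        self.entries
            .keys()
            .filter(|(k, _)| k.as_str() == key)
            .map(|(_, locale)| locale.clone())
            .collect()
    }

    fn load(&self, key: &str, locale: &str) -> Option<Vec<u8>> {
        self.entries
            .get(&(key.to_string(), locale.to_string()))
            .cloned()
    }
}

/// Driver: holds export options only and never names a concrete source.
pub struct Driver {
    pub keys: Vec<String>,
    pub locales: Vec<String>,
}

impl Driver {
    /// Copies the requested (key, locale) pairs out of any source.
    pub fn export(&self, source: &dyn DataSource) -> InMemoryBlob {
        let mut out = InMemoryBlob::default();
        for key in &self.keys {
            for locale in source.locales_for(key) {
                if !self.locales.contains(&locale) {
                    continue;
                }
                if let Some(payload) = source.load(key, &locale) {
                    out.insert(key, &locale, &payload);
                }
            }
        }
        out
    }
}
```

In this picture a source-backed provider would be a second, heavier `DataSource` implementation living in its own crate, while the "universal blob, then filter" flow is just `Driver::export` applied to a blob-backed source.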