unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org

Globaldata: bikeshed feature and constructor names #3536

Closed sffc closed 10 months ago

sffc commented 1 year ago

What should the feature be called that enables globaldata?

What name should we use for the constructors?

We had discussed in https://github.com/unicode-org/icu4x/issues/2945#issuecomment-1583147339 that the constructor names should clearly indicate the provenance of the data, but @robertbastian raised some concerns in https://github.com/unicode-org/icu4x/issues/2945#issuecomment-1591011051 that we should discuss further.

Discuss with:

Optional:

robertbastian commented 1 year ago

My order of preference for feature names:

  1. compile_data: I think this is most accurate, as it compiles the data into the binary. It doesn't imply where the data comes from, which is correct, as the feature also allows compiling client-provided data into the binary. In documentation I'd like to call the provider-less constructor the "compile-time-data constructor", and the unstable, buffer, and any constructors the "runtime-data constructors", which would align nicely with this name.
  2. compiled_data: Sounds a bit more like we already compiled that data in, but the user can actually do this themselves.
  3. data: Most concise. I don't think this is ambiguous on the component crates; on infrastructure crates like icu_provider it could be, but on a component it can only really mean one thing: enable the data.
  4. baked_data: We've been using this term already (although not consistently, the datagen mode is called --format=mod, the testdata provider unstable). I think "baked" is a good description for the data format, but less fitting for the constructors and the concept of compiling in data.
  5. auto_data: I don't think the data being automatically included is the best way to describe this. Clients can still manually set data at compile time, and it's not doing anything fancy like static analysis automatically.
  6. builtin_data: not accurate, as it's possible to use user-generated baked data via an env variable, so no data built into the crate will be used.
  7. globaldata: overloaded with the visibility descriptor in programming, and the data isn't necessarily global, as the user can supply their own (and CLDR doesn't have 100% coverage anyway).

robertbastian commented 1 year ago

We didn't record a conclusion on whether to make this a default feature in https://github.com/unicode-org/icu4x/issues/2945#issuecomment-1583147339, but iirc there wasn't any opposition to this proposal.

sffc commented 1 year ago

  1. Feature compiled_data
  2. Plain constructor names
  3. The constructor docs should say "with compiled data" in the first sentence, which should link to the constructor docs. Still keep the books emoji at the bottom.
  4. The feature is enabled by default and can be enabled/disabled with plumbing through the metacrate

LGTM: @robertbastian @zbraniecki @skius @eggrobin @Manishearth (@sffc)
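
For readers skimming the thread, the conclusion above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Greeter`, `GreetingData`, `GreetingProvider`, and `FixedProvider` are hypothetical names invented for this sketch, not ICU4X types, and the `compiled_data` gate is shown as a comment so the snippet builds without Cargo features. It shows a plainly named constructor backed by data compiled into the binary, next to a suffixed runtime-data constructor that takes a provider.

```rust
// Hypothetical sketch: `Greeter`, `GreetingData`, `GreetingProvider`, and
// `FixedProvider` are illustrative names, not ICU4X APIs.

/// The data a component needs for one locale.
#[derive(Debug, Clone, PartialEq)]
pub struct GreetingData {
    pub hello: &'static str,
}

/// Stand-in for a runtime data provider.
pub trait GreetingProvider {
    fn load(&self, locale: &str) -> Option<GreetingData>;
}

/// A trivial provider that returns the same data for every locale.
pub struct FixedProvider(pub GreetingData);

impl GreetingProvider for FixedProvider {
    fn load(&self, _locale: &str) -> Option<GreetingData> {
        Some(self.0.clone())
    }
}

// In a real crate this table would sit behind
// `#[cfg(feature = "compiled_data")]`; the attribute is left as a comment
// here (an assumption of this sketch) so it builds without Cargo features.
const COMPILED: &[(&str, GreetingData)] = &[
    ("en", GreetingData { hello: "Hello" }),
    ("fr", GreetingData { hello: "Bonjour" }),
];

pub struct Greeter {
    data: GreetingData,
}

impl Greeter {
    /// Creates a greeter with compiled data.
    // #[cfg(feature = "compiled_data")]
    pub fn try_new(locale: &str) -> Option<Self> {
        COMPILED
            .iter()
            .find(|(l, _)| *l == locale)
            .map(|(_, d)| Greeter { data: d.clone() })
    }

    /// Runtime-data constructor: the caller supplies the provider.
    pub fn try_new_unstable<P: GreetingProvider>(provider: &P, locale: &str) -> Option<Self> {
        provider.load(locale).map(|data| Greeter { data })
    }

    /// Returns the greeting held by this instance.
    pub fn greet(&self) -> &'static str {
        self.data.hello
    }
}
```

With this shape, a caller who keeps the default feature writes `Greeter::try_new("fr")`, while a caller who disables it passes a provider to `Greeter::try_new_unstable(&provider, "fr")`.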

robertbastian commented 1 year ago

Outstanding tasks: