unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org
Other
1.34k stars 174 forks source link

Data key thoughts #4991

Open robertbastian opened 3 months ago

robertbastian commented 3 months ago

Current architecture:

pub trait DataMarker { ... }
pub trait DynamicDataProvider<M: DataMarker> { load(&self, DataKey, DataRequest) }

pub trait KeyedDataMarker: DataMarker { const KEY: DataKey; }
pub trait DataProvider<M: KeyedDataMarker> { load(&self, DataRequest) }

Proposal 1: Align data marker names with data provider names

pub trait DynamicDataMarker { ... }
pub trait DynamicDataProvider<M: DynamicDataMarker> { load(&self, DataKey, DataRequest) }

pub trait DataMarker: DynamicDataMarker { const KEY: DataKey; }
pub trait DataProvider<M: DataMarker> { load(&self, DataRequest) }

Proposal 2: Get rid of "data key" terminology

pub trait DynamicDataMarker { ... }
pub trait DynamicDataProvider<M: DynamicDataMarker> { load(&self, DataMarkerInfo, DataRequest) }

pub trait DataMarker: DynamicDataMarker { const INFO: DataMarkerInfo; }
pub trait DataProvider<M: DataMarker> { load(&self, DataRequest) }

Proposal 3: DataMarkerInfo.path becomes DataMarkerInfo.name

Proposal 4: Split DataMarkerInfo's fields (maybe)

pub trait DataMarker: DynamicDataMarker { 
  const NAME: DataMarkerName;
  const IS_SINGLETON: bool;
  const FALLBACK_CONFIG: LocaleFallbackConfig;
}
sffc commented 3 months ago

Initial thoughts:

  1. OK with Proposal 1
  2. This is a somewhat radical change to the mental model, but I can buy into it because I've never been a huge fan of the term "key", and you're right that "data marker" and "data key" refer to very similar concepts
  3. I agree with the problems you laid out, but I also feel that the directory-like structure has been useful to us. I think if we did Proposal 3 we should do the variant where we use the fully qualified marker name, maybe eliding the "provider" since it will be the same for all markers. icu_list/AndListV1Marker
  4. I think I do prefer inlining the fields to the DataMarker trait directly.
sffc commented 3 months ago

We could have both inlined const fields and a helper DataMarkerInfo

pub trait DataMarker: DynamicDataMarker { 
  const NAME: DataMarkerName;
  const IS_SINGLETON: bool;
  const FALLBACK_CONFIG: LocaleFallbackConfig;
}

#[non_exhaustive]
pub struct DataMarkerInfo {
  pub name: DataMarkerName;
  pub is_singleton: bool;
  pub fallback_config: LocaleFallbackConfig;
}

impl DataMarkerInfo {
  pub fn from_marker<M: DataMarker>() -> Self { /* ... */ }
}

pub trait DynamicDataProvider<M: DynamicDataMarker> { load(&self, DataMarkerInfo, DataRequest) }

But I guess if you do that you might as well just make DataMarkerInfo a const field, as in Proposal 3.

sffc commented 3 months ago

Summary of additional discussion later:

Manishearth commented 3 months ago

But the "Marker" name is overloaded

to expand on this a little bit (but not to rabbithole too deep because I don't care about "Marker" strongly): I find Marker to be a good default, but it's nice when it can be replaced with a useful concept name instead. I think we may have options here.


one thing I proposed in that discussion that isn't included here was encoding the heirarchy in the trait, so you can have something like:

impl DataMarker for DaySymbolsV1Marker {
   const COMPONENT = "datetime";
   const SUBCOMPONENT = Some("symbols");
}
sffc commented 2 months ago

Notes from the scratch pad:

pub fn try_new_unstable<D>(
    provider: &D,
    locale: &DataLocale,
    options: CollatorOptions
) -> Result<Self, DataError>
where
    D: DataProvider<CollationSpecialPrimariesV1Marker> + DataProvider<CollationDataV1Marker> + DataProvider<CollationDiacriticsV1Marker> + DataProvider<CollationJamoV1Marker> + DataProvider<CollationMetadataV1Marker> + DataProvider<CollationReorderingV1Marker> + DataProvider<CanonicalDecompositionDataV1Marker> + DataProvider<CanonicalDecompositionTablesV1Marker> + ?Sized,

pub fn try_new_unstable<D>(
    provider: &D,
    locale: &DataLocale,
    options: CollatorOptions
) -> Result<Self, DataError>
where
    D: DataProvider<Collation_SpecialPrimaries_V1> + DataProvider<Collation_Data_V1> + DataProvider<Collation_Diacritics_V1> + DataProvider<Collation_Jamo_V1> + DataProvider<Collation_Metadata_V1> + DataProvider<Collation_Reordering_V1> + DataProvider<Collator_CanonicalDecomposition_Data_V1> + DataProvider<Collation_CanonicalDecomposition_Tables_V1> + ?Sized,

pub fn try_new_unstable<D>(
    provider: &D,
    locale: &DataLocale,
    options: FixedDecimalFormatterOptions
) -> Result<Self, DataError>
where
    D: DataProvider<DecimalSymbolsV1Marker> + ?Sized

pub fn try_new_unstable<D>(
    provider: &D,
    locale: &DataLocale,
    options: FixedDecimalFormatterOptions
) -> Result<Self, DataError>
where
    D: DataProvider<Decimal_Symbols_V1> + ?Sized

Current: DangiDatePatternV1Marker + “datetime/patterns/dangi/date@1” Shane Proposes: DateTime_Patterns_Date_Dangi_V1 Rob Proposes: DatetimePatternsDateDangiV1Marker, icu::datetime::provider::patterns::date::DangiV1Marker

where: P: DataProvider<patterns::date::DangiV1Marker> + DataProvider<patterns::date::GregorianV1Marker> ...

where: P: DataProvider<DateTime_Patterns_Date_Dangi_V1> + DataProvider<DateTime_Patterns_Date_Gregorian_V1> ...

icu::datetime::provider::patterns::date::DangiV1Marker

datetime/patterns/date/DangiV1Marker/en.json

datetime/patterns/date/dangi@1/en.json

sffc commented 5 days ago

Discussion:

Some proposals:

More discussion:

Conclusion:

LGTM: @sffc @robertbastian @Manishearth