unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org

GraphemeClusterBreakSegmenter should be GraphemeClusterSegmenter #2925

Closed (sffc closed 1 year ago)

sffc commented 1 year ago

The type for grapheme cluster segmentation is called GraphemeClusterBreakSegmenter. I think it should be called GraphemeClusterSegmenter.

In general, the names of types should have the following pattern:

| | GraphemeCluster- | Word- | Sentence- | Line- |
| --- | --- | --- | --- | --- |
| -Segmenter | GraphemeClusterSegmenter | WordSegmenter | ... | ... |
| -BreakIteratorUtf8 | GraphemeClusterBreakIteratorUtf8 | WordBreakIteratorUtf8 | ... | ... |
| ... | ... | ... | ... | ... |
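
Read this way, the pattern amounts to concatenating a segmentation-kind prefix with a role suffix. Here is a minimal sketch in Rust. `PREFIXES`, `SUFFIXES`, `type_name` and `all_type_names` are hypothetical helpers for illustration only and are not part of icu_segmenter. The Sentence- and Line- names it produces are what the pattern implies, not names confirmed in this thread, and the suffix list stops at the two suffixes spelled out above.

```rust
/// Segmentation kinds from the pattern, with the trailing hyphens dropped.
static PREFIXES: [&str; 4] = ["GraphemeCluster", "Word", "Sentence", "Line"];

/// Role suffixes spelled out in the issue; further suffixes are elided there.
static SUFFIXES: [&str; 2] = ["Segmenter", "BreakIteratorUtf8"];

/// Hypothetical helper: joins a prefix and a suffix into a type name.
fn type_name(prefix: &str, suffix: &str) -> String {
    format!("{}{}", prefix, suffix)
}

/// Hypothetical helper: every name the pattern generates, one suffix at a time.
/// Assumption: the Sentence- and Line- entries follow the same rule.
fn all_type_names() -> Vec<String> {
    SUFFIXES
        .iter()
        .flat_map(|suffix| PREFIXES.iter().map(move |prefix| type_name(prefix, suffix)))
        .collect()
}
```

Under this rule, `type_name("GraphemeCluster", "Segmenter")` gives `GraphemeClusterSegmenter`, and no combination yields `GraphemeClusterBreakSegmenter`, which is why that name stands out as inconsistent.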

CC @makotokato @aethanyc

aethanyc commented 1 year ago

The rename was done in #2707. We currently use GraphemeClusterSegmenter in:

https://github.com/unicode-org/icu4x/blob/4c06df74ba8058870625c0a8c021da05c0ba63f5/experimental/segmenter/src/grapheme.rs#L68-L70

@sffc Did I miss anything?

sffc commented 1 year ago

Hmm

https://github.com/unicode-org/icu4x/pull/2924/files#diff-7ada630f6751436efb40d05798741d43d34423b83f8ba38d8a3cfc034e41dce7R9

sffc commented 1 year ago

Was this a recent change, made after 0.7? My example uses icu@1.0.0 with the experimental feature, which enables icu_segmenter 0.7.0. That seems to be the issue.