unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org

Investigate performance impact of rearranging "can combine backwards" bit #4967

Open hsivonen opened 5 months ago

hsivonen commented 5 months ago

For characters that are their own decomposition, the least significant bit signifies "can combine backwards". As of Unicode 16, this information is also needed for complex decompositions, but in those values the least significant bit was already taken, so the second-least-significant bit is used instead (by #4860).

Investigate the performance impact of swapping the two bit allocations for complex decompositions, so that "can combine backwards" lives in the least significant bit in both cases and the check can be unified.
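
To make the idea concrete, here is a minimal sketch of the two schemes. It is not the actual ICU4X code: the constant names, the `Kind` enum, the `u32` value type, and all three functions are hypothetical, and the real trie value layout may differ. It only assumes what is described above, namely that the flag sits in bit 0 for self-decomposing characters and in bit 1 for complex decompositions.

```rust
// Hypothetical flag positions, for illustration only.
const BACKWARD_COMBINING_SELF: u32 = 1 << 0; // characters that are their own decomposition
const BACKWARD_COMBINING_COMPLEX: u32 = 1 << 1; // complex decompositions

// Hypothetical classification of a trie value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    SelfDecomposing,
    Complex,
}

// Scheme described above: the mask to test depends on the kind of value,
// so the caller has to classify the value before it can check the flag.
fn can_combine_backwards_current(value: u32, kind: Kind) -> bool {
    let mask = match kind {
        Kind::SelfDecomposing => BACKWARD_COMBINING_SELF,
        Kind::Complex => BACKWARD_COMBINING_COMPLEX,
    };
    (value & mask) != 0
}

// Proposed rearrangement, applied to complex values when the data is built:
// exchange bit 0 and bit 1 and leave all higher bits untouched.
fn swap_low_two_bits(value: u32) -> u32 {
    let bit0 = value & 1;
    let bit1 = (value >> 1) & 1;
    (value & !0b11) | (bit0 << 1) | bit1
}

// After the swap, a single mask works for both kinds of value.
fn can_combine_backwards_unified(value: u32) -> bool {
    (value & BACKWARD_COMBINING_SELF) != 0
}
```

Whether the unified check is actually faster depends on where the kind-specific branch sits in the normalizer's hot path, which is the thing to measure.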

sffc commented 5 months ago

Seems like something that would be beneficial to do in 2.0. Anyone can take this, and @hsivonen has left enough of a trail. Perhaps @echeran?

sffc commented 1 week ago

Estimation of 2.0 status: time to land normalization performance?