tchwork / utf8

Portable and performant UTF-8, Unicode and Grapheme Clusters for PHP
Apache License 2.0

Default normalization is NFD, not NFC as stated (on PHP 7.4) #75

Closed. bafetk closed this issue 3 years ago

bafetk commented 4 years ago

On PHP 7.4, Patchwork/utf8 claims NFC as the default normalization form. However, the default is hardcoded as the integer 4, and on PHP 7.4 that value is Normalizer::FORM_D, i.e. NFD.

Here are the constant values in PHP 7.4:

Normalizer::FORM_C => 16
Normalizer::FORM_D => 4
Normalizer::FORM_KC => 32
Normalizer::FORM_KD => 8
Normalizer::NONE => 2
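The values above can be checked on any given installation with a short script. This is only a sketch, assuming the intl extension is loaded so that the native Normalizer class is available; the test string and variable names are chosen for illustration. It prints the integer behind each named form and normalizes a decomposed "é" by constant name, which avoids depending on a specific integer.

```php
<?php
// Assumes the intl extension, which provides the native Normalizer class.

// "e" followed by U+0301 COMBINING ACUTE ACCENT (a decomposed "é").
$s = "e\u{0301}";

// Print the integer behind each named constant on the running PHP version.
foreach (['FORM_D', 'FORM_KD', 'FORM_C', 'FORM_KC'] as $name) {
    printf("Normalizer::%s => %d\n", $name, constant("Normalizer::$name"));
}

// Selecting the form by name gives the same result whatever the integer is.
$nfc = Normalizer::normalize($s, Normalizer::FORM_C);
$nfd = Normalizer::normalize($s, Normalizer::FORM_D);

echo bin2hex($nfc), "\n"; // c3a9   (single precomposed code point U+00E9)
echo bin2hex($nfd), "\n"; // 65cc81 ("e" plus the combining accent)
```

Passing a bare integer such as 4 straight to Normalizer::normalize() instead selects whichever form that integer maps to on the running PHP version.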

nicolas-grekas commented 4 years ago

Good catch, would you mind sending a PR to fix this?

nicolas-grekas commented 3 years ago

Actually, the default is still NFC, because we don't use the values of the native constants. I'm still seeing the same values with the latest PHP versions. Thanks for the report.
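To illustrate the general idea of not relying on native constant values, here is a minimal sketch of a wrapper that keeps its own form numbering and translates to the native constants by name. The class FormMap, its constants, and its numbering are hypothetical and are not taken from the Patchwork/utf8 source; the sketch also assumes the intl extension is loaded.

```php
<?php
// Hypothetical wrapper for illustration only; NOT the Patchwork/utf8 code.
final class FormMap
{
    // Library-private numbering (values chosen for illustration).
    const NFD = 2;
    const NFC = 4;

    public static function normalize(string $s, int $form = self::NFC): string
    {
        // Translate the private value to the native constant by name, so
        // the result does not depend on the native integer values.
        $native = [
            self::NFD => \Normalizer::FORM_D,
            self::NFC => \Normalizer::FORM_C,
        ];
        if (!isset($native[$form])) {
            throw new \InvalidArgumentException("Unknown normalization form: $form");
        }
        $out = \Normalizer::normalize($s, $native[$form]);
        if ($out === false) {
            throw new \RuntimeException('Normalization failed (invalid UTF-8?)');
        }
        return $out;
    }
}

// The private value 4 means NFC here, whatever Normalizer::FORM_C equals.
$composed = FormMap::normalize("e\u{0301}");
$decomposed = FormMap::normalize("\u{00E9}", FormMap::NFD);
```

With a mapping like this, a hardcoded default of 4 stays NFC as long as it is interpreted through the wrapper's own numbering rather than handed directly to the native class.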