Open gregsmirnov opened 8 years ago
It's probably risky to require iconv, even for 4.x... I'd be in favour of a shim to be honest, although we'd need to get @silverstripe/core-team to agree before adding any new dependencies to framework.
We are using a shim in our code now.
iconv is a requirement of SS (https://docs.silverstripe.org/en/3.2/getting_started/server_requirements/)
However, I'm fine with a shim too
Oh, if it's already a requirement, then we can simply adjust the server dependencies to "iconv or a substitute, such as tchwork/utf8".
Then we can adjust the behaviour to fail if iconv isn't installed.
However, this raises the question: if you are running a server without the necessary minimum modules installed, then it's not a bug if it behaves incorrectly, so closing. :)
The topic is about UTF-8 string normalisation, which is part of the php-intl module. The SS_Transliterator class is configured by default to ignore iconv and use a custom character mapping, which in many cases produces better results.
On the other hand, the SS_Transliterator character mappings should be improved for extensions. For example, for the Greek or Cyrillic alphabets, I am forced to use different implementations.
If we're ignoring iconv for our own implementation and it needs improving, then we should improve it :)
@gregsmirnov what's the needed action here? Can you open a PR?
I'll add test cases and prepare PR.
Thanks so much @gregsmirnov
Ok, I see your point. Thanks for the clarification. ;)
Unfortunately, Silverstripe CMS 3 entered limited support in June 2018. This means we'll only be fixing critical bugs and security issues for Silverstripe CMS 3 going forward.
You can read the Silverstripe CMS Roadmap for more information on our support commitments.
Can someone confirm if this is still a problem for SS4?
We encountered a case where A umlaut was encoded as "A\xCC\x88", and SS_Transliterator failed to convert it to AE with iconv disabled. This is caused by the COMBINING DIAERESIS character.
We fixed the problem by normalising the UTF-8 string.
The Normalizer class is part of the php-intl extension, but the tchwork/utf8 shim can be used.
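For illustration, here is a minimal sketch of the normalisation step described above. It assumes the `Normalizer` class is available, either from the php-intl extension or from a shim that provides the same class. The helper name `composeUtf8` is hypothetical and not part of SilverStripe. It composes "A" followed by COMBINING DIAERESIS into the single precomposed character (NFC), which a plain character mapping is more likely to recognise.

```php
<?php

/**
 * Hypothetical helper: compose a UTF-8 string to NFC before transliteration.
 * Returns the input unchanged if no Normalizer class is available or if
 * normalisation fails (for example on invalid UTF-8).
 */
function composeUtf8(string $source): string
{
    // Normalizer comes from php-intl, or from a shim that defines the same class.
    if (!class_exists('Normalizer')) {
        return $source;
    }

    $normalised = Normalizer::normalize($source, Normalizer::FORM_C);

    // Normalizer::normalize() returns false on failure.
    return $normalised === false ? $source : $normalised;
}

// "A" + COMBINING DIAERESIS (U+0308), as in the report above.
$decomposed = "A\xCC\x88";

// With a Normalizer available this prints "c384", the bytes of U+00C4.
echo bin2hex(composeUtf8($decomposed)), PHP_EOL;
```

The composed string can then be passed to the transliterator as usual. Whether the result becomes "AE" still depends on the character mapping in use.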