Closed SharkMachine closed 1 year ago
Where can I read about the deprecation? It doesn't appear to be on the docs: https://www.php.net/manual/en/function.utf8-decode.php
I'm generally not so hot on adding a polyfill dep. I'd rather just do a function_exists
test and not bother with support if it isn't installed. This should also prevent the need for test updates.
OK, so this suggests using mb_convert_encoding
, can we use that instead?
We could but that requires ext-mbstring
which is currently not used / required. It's also not bundled in PHP by default - https://www.php.net/manual/en/mbstring.installation.php
It looks like UConverter is bundled by default in php 5.3 or later so that would be a good pick too
ext/intl is bundled since 5.3 but I believe UConverter wasn't added until later: docs indicate 5.5. Also, I think the reference to intl being "bundled" means only that its source is shipped with PHP's (it was previously a PECL extension), not that it's built by default. That's just to say that mbstring is also a "bundled" extension in the same sense that intl is: both are included in the PHP source distribution and both are not built by default. Actually, ext/intl has an external dependency on ICU, whereas PHP directly includes the dependency library for ext/mbstring, so intl is a little "less bundled" than mbstring is.
utf8_encode and _decode are very simple functions since they only support going to/from ISO 8559-1, so you could also feasibly just directly include substitute methods for them... they'd come to something in the tens of lines.
ok if someone wants to copy paste in the polyfill that's cool too
I think the simplest solution is to keep using utf8_*()
functions on PHP 5.2 - 8.1, then try to use UConverter
or some polyfill on PHP 8.2+. From https://www.php.net/manual/en/intl.installation.php it looks to me like UConverter
is bundled and enabled by default in PHP 5.5+
It's also worth noting that the utf8_*()
functions should very rarely be used, http://htmlpurifier.org/live/configdoc/plain.html#Core.Encoding defaults to utf-8
and notably utf8_*()
only run if iso-8859-1
is set and ext-iconv
isn't enabled:
This directive only accepts ISO-8859-1 if iconv is not enabled.
yeah I'm also ok with dumping this support too without iconv
I tried replacing utf8_
with mbstring and iconv, it didn't work. utf8_
works differently
Edit: that change caused unit tests to fail
Looks like this was solved by #326, big thanks to @zerocrates.
Removed usages of deprecated functions
utf8_decode
&utf8_encode
This definitely isn't the way to solve the issue in the long run, but would at least keep backwards compatibility with minimal changes apart from dropping PHP 5.2 support. Removing
utf8_encode
andutf8_decode
completely would require quite a bit of changes to related tests and would also break backwards compatibility.