Masterminds / html5-php

An HTML5 parser and serializer for PHP.
http://masterminds.github.io/html5-php/

ext-mbstring should be required #203

Open alecpl opened 3 years ago

alecpl commented 3 years ago

Here's why:

  1. Masterminds\HTML5\Parser\CharacterReference::lookupDecimal() uses mb_decode_numericentity() unconditionally.
  2. Looking at Masterminds\HTML5\Parser\UTF8Utils::convertToUTF8(), either iconv or mbstring must be available (if the input encoding is not 'auto').
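
To illustrate the first point, here is a minimal sketch of a decimal character reference lookup built on mb_decode_numericentity(). The function name lookupDecimalSketch() and the conversion map are assumptions for illustration, not the library's actual code. The function_exists() guard is the kind of check that an unconditional call lacks.

```php
<?php
// Hypothetical stand-in for a decimal character reference lookup.
// Not the library's implementation; names and map values are illustrative.
function lookupDecimalSketch(int $codePoint): string
{
    // Without ext-mbstring this function does not exist, and an
    // unguarded call would end in a fatal "undefined function" error.
    if (!function_exists('mb_decode_numericentity')) {
        throw new RuntimeException('ext-mbstring is required to decode numeric character references');
    }

    // Conversion map: [first code point, last code point, offset, mask].
    // This one covers the whole Unicode range.
    $map = [0x0, 0x10FFFF, 0, 0x1FFFFF];

    return mb_decode_numericentity('&#' . $codePoint . ';', $map, 'UTF-8');
}

echo lookupDecimalSketch(8364), PHP_EOL; // prints the euro sign
```

Declaring ext-mbstring as a requirement would make the guard unnecessary, since Composer would refuse to install the package on a PHP build without the extension.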

Requiring it would allow us to:

  1. Get rid of iconv() use. In my experience mbstring is really a better solution.
  2. Remove the use of utf8_decode(), which is not really valid and is not needed when mbstring is available.
  3. Get rid of the fallback code.
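
If ext-mbstring were a hard requirement, the conversion could plausibly shrink to a single mbstring code path. The sketch below assumes exactly that. toUtf8Sketch() is a hypothetical name, and dropping unconvertible characters via mb_substitute_character('none') is one possible policy, not necessarily the library's.

```php
<?php
// Hypothetical mbstring-only conversion with no iconv() or utf8_decode() fallback.
function toUtf8Sketch(string $data, string $encoding = 'UTF-8'): string
{
    // Remember the caller's substitute character so it can be restored.
    $saved = mb_substitute_character();

    // Silently drop characters that cannot be converted.
    mb_substitute_character('none');
    $converted = mb_convert_encoding($data, 'UTF-8', $encoding);
    mb_substitute_character($saved);

    if ($converted === false) {
        throw new RuntimeException('Conversion from ' . $encoding . ' failed');
    }

    return $converted;
}

// "\xE9" is U+00E9 in ISO-8859-1; in UTF-8 it becomes the two bytes C3 A9.
echo bin2hex(toUtf8Sketch("\xE9", 'ISO-8859-1')), PHP_EOL; // prints c3a9
```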

alecpl commented 2 years ago

Actually, utf8_decode() is deprecated as of PHP 8.2 and will be removed later, so this is more like a bug now.
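
For reference, mb_convert_encoding() can do the same UTF-8 to ISO-8859-1 conversion that utf8_decode() performs. The sketch below shows that substitution in isolation. How the library actually uses utf8_decode() is not stated in this thread, so the helper name latin1FromUtf8() is hypothetical.

```php
<?php
// Hypothetical helper: a deprecation-free replacement for utf8_decode(),
// assuming ext-mbstring is available.
function latin1FromUtf8(string $utf8): string
{
    // Same UTF-8 to ISO-8859-1 conversion, without the PHP 8.2 deprecation notice.
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// U+00E9 is two bytes in UTF-8 (C3 A9) and one byte in ISO-8859-1 (E9).
echo bin2hex(latin1FromUtf8("\xC3\xA9")), PHP_EOL; // prints e9
```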

goetas commented 1 year ago

Sorry for the late reply. What you are suggesting makes sense. I would be happy to see a PR.