Closed: hyoungki-kim closed this issue 11 months ago
@hyoungki-kim Hi, can you also attach the file that has the problem? I will use it for a unit test.
This is the file (the language is Korean): sbs-das_2023.zip
You can also refer to a comment on php.net. That comment is as follows.
Text-encoding HTML-ENTITIES will be deprecated as of PHP 8.2. To convert all non-ASCII characters into entities (to produce pure 7-bit HTML output), I was using:
echo mb_convert_encoding( htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' ), 'HTML-ENTITIES', 'UTF-8' );
I can get the identical result with:
echo mb_encode_numericentity( htmlentities( $text, ENT_QUOTES, 'UTF-8' ), [0x80, 0x10FFFF, 0, ~0], 'UTF-8' );
The output contains well-known named entities for some often used characters and numeric entities for the rest.
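To make the convert map in that comment concrete, here is a minimal sketch. The sample string is hypothetical and not taken from the attached file. The four integers in the map are the first code point, the last code point, an offset, and a mask, so this map covers everything from U+0080 upward.

```php
<?php
// Convert map layout: [first code point, last code point, offset, mask].
// This map selects every code point from U+0080 up to U+10FFFF.
$convmap = [0x80, 0x10FFFF, 0, ~0];

// Hypothetical sample text; the Korean syllable is U+D55C (decimal 54620).
$text = 'A&B 한';

// Only characters inside the map are rewritten; ASCII, including "&", is left alone.
$numericOnly = mb_encode_numericentity($text, $convmap, 'UTF-8');

// Running htmlentities() first additionally escapes HTML special characters.
$escapedFirst = mb_encode_numericentity(
    htmlentities($text, ENT_QUOTES, 'UTF-8'),
    $convmap,
    'UTF-8'
);

echo $numericOnly, PHP_EOL;   // A&B &#54620;
echo $escapedFirst, PHP_EOL;  // A&amp;B &#54620;
```

The difference between the two results is the reason the htmlentities() step matters only when the input is meant to be escaped as HTML text.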
But our $file_content is not HTML text that needs escaping, so the htmlentities() step is unnecessary. This code is correct and works well:
$file_content = mb_encode_numericentity( $file_content, [0x80, 0x10FFFF, 0, ~0], 'UTF-8' );
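As a rough illustration of why this helps, the sketch below encodes a short SMI-like fragment before handing it to DOMDocument::loadHTML. The fragment and variable names other than $file_content are made up for the example, and the package's real loading code may look different. The assumption is that loadHTML, given no charset declaration, does not read the input as UTF-8, so turning non-ASCII characters into numeric entities first keeps them intact.

```php
<?php
// Hypothetical SMI-like fragment; the real file content will differ.
$file_content = '<P Class=KRCC>안녕하세요</P>';

// Replace every non-ASCII character with a numeric entity; markup stays as-is.
$file_content = mb_encode_numericentity($file_content, [0x80, 0x10FFFF, 0, ~0], 'UTF-8');

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($file_content);

// The HTML parser lowercases tag names, so look up "p".
$paragraph = $doc->getElementsByTagName('p')->item(0);
$text = $paragraph->textContent;

echo $text, PHP_EOL;
```

Because the entities are decoded by the parser itself, textContent comes back as ordinary UTF-8 text.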
Thank you. Updated the code and released a new package version.
Hello.
Unicode characters are broken when an SMI file is loaded. This is the code.
Please insert this code before the loadHTML call, like this...