jfcherng / php-mb-string

An implementation targeting high performance for frequently reading/writing operations for multi-byte string.
MIT License
10 stars 1 forks source link

Mbstring class error on convEncoding method #3

Open imansa opened 3 weeks ago

imansa commented 3 weeks ago

method return type is string and return value of calling iconv method returned directly but iconv method can send false return sometimes and create a Fatal Error

https://github.com/jfcherng/php-mb-string/blob/8407bfefde47849c9e7c9594e6de2ac85a0f845d/src/MbString.php#L360

need change

sample data

'1H*Ì*1', 'UTF-8', 'UTF-32'

jfcherng commented 3 weeks ago

Does iconv always return false or it only returns false for your sample data?

imansa commented 3 weeks ago

According to php documention iconv Returns the converted string, or false on failure.

in my tests returning false is not for all datas

jfcherng commented 3 weeks ago

mostly likely the input string is out of ASCII region and not utf-8, otherwise I see no reason a utf-8 valid string failed to be converted into utf-32.

jfcherng commented 3 weeks ago

Just to make sure,

$ php -r "var_dump(iconv('UTF-8', 'UTF-32', '1H*Ì*1'));"

shows false on your side?


$ php -r "var_dump(iconv('UTF-8', 'UTF-32', 'Ì'));"

shows false too?

imansa commented 3 weeks ago

i couldn't detected data that result false returning for iconv but this hapend multiple times in production. when iconv can return false i think it's better to handle failure state for using it

jfcherng commented 3 weeks ago

do you have php mbstring extension installed?

jfcherng commented 3 weeks ago

Just to make sure,

$ php -r "var_dump(iconv('UTF-8', 'UTF-32', '1H*Ì*1'));"

shows false on your side?

$ php -r "var_dump(iconv('UTF-8', 'UTF-32', 'Ì'));"

shows false too?

i couldn't detected data that result false returning for iconv

you don't have to. you just have to execute those 2 cli commands and tell me the returned data in the console.

imansa commented 3 weeks ago
bash-4.4$ php -r "var_dump(iconv('UTF-8', 'UTF-32', '1H*Ì*1'));"
string(28) "��1H*�*1"
bash-4.4$ php -r "var_dump(iconv('UTF-8', 'UTF-32', 'Ì'));"
string(8) "���"
imansa commented 3 weeks ago
PHP Version 8.1.29
System  Linux admin 4.18.0-513.24.1.el8_9.x86_64 #1 SMP Thu Apr 4 18:13:02 UTC 2024 x86_64 

mbstring
Multibyte Support   enabled
Multibyte string engine     libmbfl
HTTP input encoding translation     disabled
libmbfl version     1.3.2 
jfcherng commented 3 weeks ago
bash-4.4$ php -r "var_dump(iconv('UTF-8', 'UTF-32', '1H*Ì*1'));"
string(28) "��1H*�*1"
bash-4.4$ php -r "var_dump(iconv('UTF-8', 'UTF-32', 'Ì'));"
string(8) "���"

Hmm... looks perfect to me. So I would assume the issue is that some of your UTF source code files are not UTF-8 encoded.

bash-4.4$ php -r "var_dump(iconv('UTF-8', 'UTF-32', '1HÌ1'));" string(28) "��1H1"

As you can see, your example data can't reproduce the issue from the console.

jfcherng commented 3 weeks ago

You can try to use mbstring by default to see whether it works. Just change the order of these two if blocks: https://github.com/jfcherng/php-mb-string/blob/8407bfefde47849c9e7c9594e6de2ac85a0f845d/src/MbString.php#L347-L358

imansa commented 3 weeks ago

i have no problem to solve this problem but when i am using composer to update dependencies, all changes will be replaced. i created this issue to fix bug for returning false state when using iconv

jfcherng commented 3 weeks ago

Without your experiments, no one knows how to fix it because obviously I am not affected. And you are the first one who is affected since this lib has been created, at least from the issue list here.

jfcherng commented 3 weeks ago

i have no problem to solve this problem

What you have changed on your side?