Open bezin opened 1 week ago
cc @youkidearitai
I can confirm the issue under Linux:
php -r "echo tempnam(__DIR__, 'this_string_pushes_the_multi_byte_character_in_position_______ä_foobar');"
…/this_string_pushes_the_multi_byte_character_in_position_______�0e3XkX
Windows does not seem to be affected (note: under Windows it only uses the first 3 characters of the prefix):
php -r "echo tempnam(__DIR__, 'abä');"
…\abä8B7D.tmp
Hi, @bezin !
Thank you for report.
This is filesystem problem. macOS filesystem is HFS+. HFS+ is normalized in NFD.
For example, ä
is \uc2a4
in UTF-8. However, NKD normalized ä
is a\ucc88
.
Based on this, below code is success and not displays error.
<?php
$prefix = 'this_string_pushes_the_multi_byte_character_in_position_______' . 'a\ucc88' . 'foobar';
var_dump(tempnam(__DIR__, $prefix));
If you want to normalize in PHP, please use Normalizer::normalize
and normalizer_normalize
.
<?php
$prefix = 'this_string_pushes_the_multi_byte_character_in_position_______' . normalizer_normalize("ä", Normalizer::NFD) . 'foobar';
var_dump(tempnam(__DIR__, $prefix));
reference:
@youkidearitai the issue is not the filesystem itself. The issue is that tempnam()
cuts off multi-byte characters, resulting in an invalid character in the prefix (see examples above). The issue with the filesystem is just an error after the fact.
@fritzmg I see. I tried reading the source code, but maybe it doesn't support multibyte. I feel like I need more understanding.
Dear @youkidearitai, thanks for taking the time to look into this!
I tested you proposed solution and it does indeed work with a normalized ä
, however it does not work even with a normalized ß
, leading to the same unexpected behaviour as stated above:
> php -r "var_dump(tempnam(__DIR__, 'this_string_pushes_th
e_multi_byte_character_in_position_______' . normalizer_normalize('ß', Normalizer::NFD) . 'foobar'));"
PHP Notice: tempnam(): file created in the system's temporary directory in Command line code on line 1
Notice: tempnam(): file created in the system's temporary directory in Command line code on line 1
bool(false)
I second @fritzmg here, as I would expect the function to correctly handle multi-byte strings internally.
EDIT:
Adding insights by @gmpf:
When you use a normalized ä
but remove one of the underscores, moving the ä
to position 62, you run into the same issue:
> php -r "var_dump(tempnam(__DIR__, 'this_string_pushes_th
e_multi_byte_character_in_position______' . normalizer_normalize('ä', Normalizer::NFD) . 'foobar'));"
PHP Notice: tempnam(): file created in the system's temporary directory in Command line code on line 1
Notice: tempnam(): file created in the system's temporary directory in Command line code on line 1
bool(false)
This is probably due to the fact that the normalization converts ä
into a
and the diaresis, which itself is a multi-byte character. Hence, character 63 in the string is again a multi-byte character:
php -r "var_dump(str_split(normalizer_normalize('ä', Nor
malizer::NFD)));"
array(3) {
[0]=>
string(1) "a"
[1]=>
string(1) "�"
[2]=>
string(1) "�"
}
@fritzmg I see. I tried reading the source code, but maybe it doesn't support multibyte. I feel like I need more understanding.
$prefix
of tempnam()
intentionally not supporting multi-byte strings would still be fine - but it would need to be documented at least here.
However, I would argue that tempnam()
should take multi-byte strings into account.
While php_basename()
caters to multibyte characters, the following code does not. Replacing that code with
p = php_basename(prefix, MIN(prefix_len, 63), NULL, 0);
might work, if there wasn't
Description
Hi all,
first of all thanks for all the development on PHP. Love the language more than ever. In a recent project I encountered this bug (?):
tempnam()
shortens the given prefix after character 63. When passing a multi-byte character in this position, the shortened prefix contains a malformed character. MacOS does not seem to be able to handle that, hence no file is created and a notice is triggered. The PHP docs do not state, that multi-byte strings are not allowed withtempnam()
.To reproduce
test.php
in this folder and paste the following code into it:php test.php
in that directory.Desired behaviour
It should create a temporary file in this folder and
tempnam()
should return the path to the file:Actual behaviour
On MacOS (tested with Sonoma 14.6.1 and Sequoia 15.1) no temporary file is created, a notice is triggered and
tempnam()
returnsfalse
:When you change the
ä
in position 63 for anything but a multi-byte character, everything works as expectedIf I can provide any further information, please let me know.
PHP Version
PHP 8.3.13
Operating System
MacOS Sonoma 14.6.1