php / php-src

The PHP Interpreter
https://www.php.net
Other
38.16k stars 7.75k forks source link

php_stream_memory_get_buffer() not zero-terminated #15628

Closed cmb69 closed 1 month ago

cmb69 commented 2 months ago

Description

The documentation of zend_strings states:

It should be noted that while len stores the length without a trailing null byte, the actual string contents in val must always contain a trailing null byte.

According to this, https://github.com/php/pecl-mail-mailparse/commit/36807106310ceeade795a3091dfff3a6a4564c28 has been commited a while ago. However, that caused an issue which had been reported as https://github.com/php/pecl-mail-mailparse/issues/31.

The problem here is that php_stream_memory_get_buffer returns a zend_string which is not zero-terminated. Now, the question is whether that is a bug in php_stream_memory_get_buffer(), or whether zend_strings are always zero-terminated, unless they aren't.

A possible fix for php_stream_memory_get_buffer() might be:

 main/streams/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/main/streams/memory.c b/main/streams/memory.c
index f53084a6c3a..be8223fc72a 100644
--- a/main/streams/memory.c
+++ b/main/streams/memory.c
@@ -321,7 +321,7 @@ PHPAPI zend_string *_php_stream_memory_get_buffer(php_stream *stream STREAMS_DC)
 {
    php_stream_memory_data *ms = (php_stream_memory_data*)stream->abstract;
    ZEND_ASSERT(ms != NULL);
-   return ms->data;
+   return zend_string_init_fast(ZSTR_VAL(ms->data), ZSTR_LEN(ms->data));
 }
 /* }}} */

If this is not a bug in php_stream_memory_get_buffer(), an obvious fix for mailparse would be to revert https://github.com/php/pecl-mail-mailparse/commit/36807106310ceeade795a3091dfff3a6a4564c28.

PHP Version

PHP 8.1

Operating System

any

nielsdos commented 2 months ago

Afaik zend_string should always be NUL terminated. It would be sad to have to init a new string though...

cmb69 commented 2 months ago

Well, I guess we could also do:

 main/streams/memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/main/streams/memory.c b/main/streams/memory.c
index f53084a6c3a..9e90a606d5c 100644
--- a/main/streams/memory.c
+++ b/main/streams/memory.c
@@ -321,6 +321,7 @@ PHPAPI zend_string *_php_stream_memory_get_buffer(php_stream *stream STREAMS_DC)
 {
    php_stream_memory_data *ms = (php_stream_memory_data*)stream->abstract;
    ZEND_ASSERT(ms != NULL);
+   ZSTR_VAL(ms->data)[ZSTR_LEN(ms->data)] = '\0';
    return ms->data;
 }
 /* }}} */

But looking at the implementation of php_stream_memory_write() (which is called by php_stream_putc() among others), I suspect a lot of reallocation happening, so this new string initialization might not make that much of a difference. I wonder whether we should use a smart_str instead for the buffer.

nielsdos commented 2 months ago

I like your last patch better (didn't check whether ZSTR_LEN will go out of bounds though), but I believe it should be something along those lines. To smart_str or not to smart_str is a bit orthogonal, it's hard to say without a benchmark.

cmb69 commented 2 months ago

See

https://github.com/php/php-src/blob/367f303efafdd897aa80c8a9587865fa40f05b04/Zend/zend_string.h#L196-L203

zend_string_alloc() is supposed to allocate at least one byte more than requested. I'll make a PR shortly.

To smart_str or not to smart_str is a bit orthogonal, it's hard to say without a benchmark.

Right!

nielsdos commented 2 months ago

See (...) zend_string_alloc() is supposed to allocate at least one byte more than requested.

Well yes, I know this. But the reason why it is safe is here:

https://github.com/php/php-src/blob/367f303efafdd897aa80c8a9587865fa40f05b04/main/streams/memory.c#L58

because it uses the realloc of zend_string itself instead of something magical custom.

I'll make a PR shortly.

Great! Thanks.