krakjoe / uopz

User Operations for Zend
Other
356 stars 47 forks source link

PHP 8.3: uopz_set_static() may cause segfaults #178

Open cmb69 opened 1 month ago

cmb69 commented 1 month ago

Consider the following code:

const FOO = "a";

function foo() {
    static $a = FOO;
}

uopz_set_static("foo", ["a" => 42]);
var_dump(uopz_get_static("foo")["a"]);

foo();

Setting the static variable to 42 works fine, as the following var_dump() shows, but calling foo() triggers an assertion error in a debug build, and likely a segfault in release builds, as of PHP 8.3.0. With previous PHP versions the code works as expected.

cmb69 commented 1 month ago

First I though maybe we can hack around this by setting the IS_STATIC_VAR_UNINITIALIZED flag. And that gives the desired output for the script above.

patch ````diff src/function.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/function.c b/src/function.c index bde335d..e988792 100644 --- a/src/function.c +++ b/src/function.c @@ -318,6 +318,7 @@ zend_bool uopz_set_static(zend_class_entry *clazz, zend_string *function, zval * ZEND_HASH_FOREACH_STR_KEY_VAL(variables, k, v) { zval *y; + int unitialized; if (Z_REFCOUNTED_P(v)) { zval_ptr_dtor(v); @@ -330,7 +331,11 @@ zend_bool uopz_set_static(zend_class_entry *clazz, zend_string *function, zval * continue; } + unitialized = Z_TYPE_EXTRA_P(v) & IS_STATIC_VAR_UNINITIALIZED; ZVAL_COPY(v, y); + if (unitialized) { + Z_TYPE_EXTRA_P(v) |= IS_STATIC_VAR_UNINITIALIZED; + } } ZEND_HASH_FOREACH_END(); return 1; ````

But better be sure that everything works:

const FOO = "a";

function foo() {
    static $a = FOO;
    return $a;
}

uopz_set_static("foo", ["a" => 42]);
var_dump(uopz_get_static("foo")["a"]);

var_dump(foo());

outputs:

int(42)
string(1) "a"

Okay, what's going on (opcache.opt_debug_level=0x10000):

foo:
     ; (lines=5, args=0, vars=1, tmps=1)
     ; (before optimizer)
     ; C:\php-sdk\phpdev\vs17\x64\static.php:5-8
     ; return  [] RANGE[0..0]
0000 BIND_INIT_STATIC_OR_JMP CV0($a) 0003
0001 T1 = FETCH_CONSTANT string("FOO")
0002 BIND_STATIC (ref) CV0($a) T1
0003 RETURN CV0($a)
0004 RETURN null

So it seems to me, that even if we would install our own BIND_INIT_STATIC_OR_JMP handler, we couldn't fix uopz_set_static() for the general case. @iluuu1994, thoughts?

iluuu1994 commented 1 month ago

but calling foo() triggers an assertion error in a debug build

Can you clarify which assertion? The one I see is ZEND_ASSERT(Z_TYPE_P(value) == IS_REFERENCE); in the !(Z_TYPE_EXTRA_P(value) & IS_STATIC_VAR_UNINITIALIZED) condition. This would indicate that the static variables array contains something that is not a reference, which (apart from NULL with the IS_STATIC_VAR_UNINITIALIZED flag) is not expected.

That said, I'm not sure why I didn't consider just checking for IS_NULL, given that a user-set null value would be embedded in a reference. This patch seems to work too. Maybe that would solve your problem as well.

diff --git a/Zend/zend_compile.c b/Zend/zend_compile.c
index 9253cb9708..307bcab819 100644
--- a/Zend/zend_compile.c
+++ b/Zend/zend_compile.c
@@ -5483,7 +5483,6 @@ static void zend_compile_static_var(zend_ast *ast) /* {{{ */
        zend_op *opline;

        zval *placeholder_ptr = zend_hash_update(CG(active_op_array)->static_variables, var_name, &EG(uninitialized_zval));
-       Z_TYPE_EXTRA_P(placeholder_ptr) |= IS_STATIC_VAR_UNINITIALIZED;
        uint32_t placeholder_offset = (uint32_t)((char*)placeholder_ptr - (char*)CG(active_op_array)->static_variables->arData);

        uint32_t static_def_jmp_opnum = get_next_op_number();
diff --git a/Zend/zend_types.h b/Zend/zend_types.h
index d1db25a8f2..d33f8a33bc 100644
--- a/Zend/zend_types.h
+++ b/Zend/zend_types.h
@@ -791,11 +791,6 @@ static zend_always_inline uint32_t zval_gc_info(uint32_t gc_type_info) {
 /* zval.u1.v.type_flags */
 #define IS_TYPE_REFCOUNTED         (1<<0)
 #define IS_TYPE_COLLECTABLE            (1<<1)
-/* Used for static variables to check if they have been initialized. We can't use IS_UNDEF because
- * we can't store IS_UNDEF zvals in the static_variables HashTable. This needs to live in type_info
- * so that the ZEND_ASSIGN overrides it but is moved to extra to avoid breaking the Z_REFCOUNTED()
- * optimization that only checks for Z_TYPE_FLAGS() without `& (IS_TYPE_COLLECTABLE|IS_TYPE_REFCOUNTED)`. */
-#define IS_STATIC_VAR_UNINITIALIZED        (1<<0)

 #if 1
 /* This optimized version assumes that we have a single "type_flag" */
diff --git a/Zend/zend_vm_def.h b/Zend/zend_vm_def.h
index 77c1487b65..7c3a22ddb1 100644
--- a/Zend/zend_vm_def.h
+++ b/Zend/zend_vm_def.h
@@ -9110,7 +9110,7 @@ ZEND_VM_HANDLER(203, ZEND_BIND_INIT_STATIC_OR_JMP, CV, JMP_ADDR)
    ZEND_ASSERT(GC_REFCOUNT(ht) == 1);

    value = (zval*)((char*)ht->arData + opline->extended_value);
-   if (Z_TYPE_EXTRA_P(value) & IS_STATIC_VAR_UNINITIALIZED) {
+   if (Z_TYPE_P(value) == IS_NULL) {
        ZEND_VM_NEXT_OPCODE();
    } else {
        SAVE_OPLINE();
diff --git a/Zend/zend_vm_execute.h b/Zend/zend_vm_execute.h
index b32ef2a6cc..f234bb149a 100644
Binary files a/Zend/zend_vm_execute.h and b/Zend/zend_vm_execute.h differ
iluuu1994 commented 1 month ago

I guess the primary difference is that ZEND_BIND_STATIC allows non-reference values in the static variables array, while ZEND_BIND_INIT_STATIC_OR_JMP does not. I think it's reasonable to expect uopz to wrap the set values in references, which is seems like it doesn't. It even seems like it detaches existing references, which can have side effects if the reference escapes the function.

cmb69 commented 1 month ago

Thanks for the quick reply!

I think it's reasonable to expect uopz to wrap the set values in references, which is seems like it doesn't.

Indeed, it does not. Maybe this is actually the root cause of the problem. I'll check that out.

That said, I'm not sure why I didn't consider just checking for IS_NULL, given that a user-set null value would be embedded in a reference. This patch seems to work too. Maybe that would solve your problem as well.

I don't think that this would help, but I'll have a closer look.

iluuu1994 commented 1 month ago

I don't think that this would help, but I'll have a closer look.

Probably not, but it looks like a nice simplification nonetheless.

cmb69 commented 1 month ago

It even seems like it detaches existing references, which can have side effects if the reference escapes the function.

You are referring to code like

function &foo() {
    static $a = 1;
    return $a;
}

$a = &foo();
// uopz_set_static("foo", ["a" => 3]); // outputs 3 when uncommented
$a = 2;
var_dump(foo()); // outputs 2

That doesn't look right to me, but apparently nobody had complained about this (although the code is implented this way for years), so I'm somewhat reluctant to change the behavior.

iluuu1994 commented 1 month ago

Yes, that's what I was referring to. uopz_set_static() will detch the reference and the write to $a will no longer influence the static variable. But I don't care about uopz's behavior, so feel free to treat as a wontfix.