php / php-src

The PHP Interpreter
https://www.php.net
Other
38.17k stars 7.75k forks source link

Significant performance degradation in 'foreach' starting from PHP 8.2.13 (caused by garbage collection) #13193

Closed sboikov1983 closed 9 months ago

sboikov1983 commented 9 months ago

Description

We noticed significant performance degradation in our application after migration to PHP 8.2.13 (and PHP 8.3.0 has the same issue). Our application performs a lot of computations and in some cases execution time increased from 2 to 6 hours.

Here is a simple script demonstrating the problem (run as 'php -dmemory_limit=-1 perf_test.php'):

<?php
class PerfTest {
    /*
     Create 'object' with many fields:
    {
        "obj_0": {"field_0": 1, "field_1": 2, ...},
        "obj_1": {"field_0": 1, "field_1": 2, ...},
        ...
    }
     */
    private static function generateObject(): array {
        $resultObj = [];
        for ($i = 0; $i < 100; $i++) {
            $nestedObj = [];
            for ($j = 0; $j < 20; $j++) {
                $nestedObj["field_$j"] = 0;
            }
            $resultObj["obj_$i"] = $nestedObj;
        }
        return $resultObj;
    }

    public function run(): void {
        $objects = [];
        for ($i = 0; $i < 50000; $i++) {
            $objects [] = self::generateObject();
        }

        $start0 = microtime(true);
        foreach ($objects as $object) {
            foreach ($object as $nestedObj) {
                // No-op, just iterate generated objects.
            }
        }
        echo "Iteration time: " . round((microtime(true) - $start0), 1) . " sec\n";
    }
}
(new PerfTest())->run();

Here is timing for PHP 8.2.12 and PHP 8.2.13 when GC is enabled and disabled:

PHP 8.2.12 -dzend.enable_gc=1: Iteration time: 0.8 sec
PHP 8.2.12 -dzend.enable_gc=0: Iteration time: 0.1 sec
PHP 8.2.13 -dzend.enable_gc=1: Iteration time: 8 sec
PHP 8.2.13 -dzend.enable_gc=0: Iteration time: 0.1 sec

For 8.2.12 enabling GC doesn't have major impact on performance, but for 8.2.13 enabling GC increases time from 0.1 sec to 8 seconds (the same for 8.3.0). This is just a simple script to demonstrate the issue, in real application impact on performance is dramatic.

As I understand reason of additional GC overhead is this commit. I understand that it fixes potential memory leak, but perphaps there is a way to optimize it, so it won't cause such dramatic performance degradation?

I also understand that it's possible to selectively disable/enable GC in the application, or somehow get rid of 'foreach' loops. But since our application has thousands of such loops, changing the code would be very hard (and potentially the same issue can be in the 3rd party libraries which we can't change).

PHP Version

PHP 8.3

Operating System

Ubuntu 20.04, Windows 11

SVGAnimate commented 9 months ago

I confirm this drop in performance under Linux.

0.3 sec to 1.1s


[!WARNING]
I notice that it seems extremely difficult to understand the usefulness of this test.

$array = [0, 1];
foreach($array as &$v) {
    $array[0] =& $array;

    $b = 1;
    $array =& $b;

    gc_collect_cycles();
    break;
}
var_dump(gc_collect_cycles());

[!TIP] If we create a reference above the loop, the memory leak no longer exists

$a = [0, 1];
$array = &$a;
foreach($array as &$v) {
// 
}
SVGAnimate commented 9 months ago

Yop

@dstogov I looked this report to get my hands of php. I share my essays with you. Hoping to get your feedback and your advice.

As I said above, I tried to create a temporary reference to feed the "foreach" loop. I have check that there was no memory leak with the "*.phpt" #12575 by deleting || kind == ZEND_LIVE_LOOP .

I was able to see from the @sboikov1983 test that from 1.1s I fell back to 0.3s

#if 1
    znode tmp_result;
    opnum_reset = get_next_op_number();
    zend_emit_op(&tmp_result, ZEND_MAKE_REF, &expr_node, NULL);

    opnum_reset = get_next_op_number();
    opline = zend_emit_op(&reset_node, by_ref ? ZEND_FE_RESET_RW : ZEND_FE_RESET_R, &tmp_result, NULL);
#else
    opnum_reset = get_next_op_number();
    opline = zend_emit_op(&reset_node, by_ref ? ZEND_FE_RESET_RW : ZEND_FE_RESET_R, &expr_node, NULL);
#endif

Not having a global vision of how PHP works. I don't know what this modification is worth. Is this the right way to do it?

dstogov commented 9 months ago

The degradation occurs because https://github.com/php/php-src/commit/abe3673d1fd09a35c25b4da6248f7e8c106aa37e adds a big array into GC buffer, and then it's iteratively checked by GC. It doesn't slow-downs foreach but increases amount of GC work, when it's repeatable launched from inside the foreach loop.

The fix looks right, and I don't see a simple way to fix this degradation. I'll think a bit more...

dstogov commented 9 months ago

The issue was closed by mistake

dstogov commented 9 months ago

I thought we may reduce the degradation by limiting effect if the old fix only to references. But this doesn't work. The next example explains this. It starts to fail (leak) when the old fix removed.

<?php
function foo() {
    $a = [];
    $a[0] = 1;
    $a[1] = &$a;
    return $a;
}
foreach(foo() as $v) {                  // foo() returns a circular array
    var_dump(gc_collect_cycles());  // GC can't free the array because it's still iterated
    break;
}
var_dump(gc_collect_cycles());          // GC is able to detect and free the array only if it was added to garbage candidates buffer after the first GC above (see the old fix)
?>

@iluuu1994 @nielsdos may be you have any ideas?

nielsdos commented 9 months ago

I don't really have a better idea. The only thing I think about is changing the VM instead of GC, like so: https://gist.github.com/nielsdos/80c7f19ab12a4ae6678dc403246435ba this reduces the impact on the performance, it seems on par now with what it used to be.

To be honest I am not very familiar with the GC and its nuances, I did not read much code of it yet. So probably I am missing something why my idea is wrong.

dstogov commented 9 months ago

@nielsdod the current solution adds iterated variable into GC buffer only after GC launched from inside a foreach loop. Your solution will add into GC buffer any variable used in foreach. This may help to foreach over very big arrays (like in this test case), but on the other hand will slowdown apps with many foreach over smaller arrays.

nielsdos commented 9 months ago

Okay I understand, thanks for pointing that out. I was not able to come up with something else.

dstogov commented 9 months ago

@SVGAnimate we can't disable array modification and this won't fix the problem anyway (see my last example above).

The problem that on each foreach iteration we increment/decrement refcounter of the current element (because of assignment to $v) and then store it into GC buffer. This leads to too often calls to GC collector and in case of a big array we get quadratic slowdown.

I think, the problem may be reduced in general, if before the main GC collector algorithm we will remove from GC buffer all elements reachable from the VM roots (global and static variables, CVs and alive temporary variables of stack frames, etc...) We may start only with alive temporary variables...

I'm not sure when I get time to check this idea. @iluuu1994 @nielsdos can you think in this direction?

SVGAnimate commented 9 months ago

I had thought of something completely different to maintain PHP performance.

What do you think about not creating an internal reference(using ZEND_FE_RESET_R instead of ZEND_FE_RESET_RW )? Let me explain. in zend_compile_foreach we could use the $array[$key] variable(opline->op2) as an alias instead of &$value.

This involves adding a context to the zend_compile_stmt(stmt_ast, context_alias)

It can also be used for list() foreach($array as [&$x]) ($x is compiled as an alias of $array[$key][0] instead a zval) This is not documented. I tried to make an RFC for list but all I do is push myself even deeper into the difficulty.

PS: Backward compatibility bugs will not be supported. There is therefore little chance that this will be accepted...

SVGAnimate commented 9 months ago

@sboikov1983 It's normal that PHP is slower, now it fixes memory leaks. We can not have everything. In addition, creating an array of a million records takes 8s compared to 0.3/1.1s to browse it. Is this really the problem? Perhaps an extension allowing you to carry out your heavy calculations?