Closed sboikov1983 closed 9 months ago
I confirm this drop in performance under Linux.
0.3 sec to 1.1s
[!WARNING]
I notice that it seems extremely difficult to understand the usefulness of this test.
$array = [0, 1];
foreach($array as &$v) {
$array[0] =& $array;
$b = 1;
$array =& $b;
gc_collect_cycles();
break;
}
var_dump(gc_collect_cycles());
[!TIP] If we create a reference above the loop, the memory leak no longer exists
$a = [0, 1]; $array = &$a; foreach($array as &$v) { // }
Yop
@dstogov I looked this report to get my hands of php. I share my essays with you. Hoping to get your feedback and your advice.
As I said above, I tried to create a temporary reference to feed the "foreach" loop.
I have check that there was no memory leak with the "*.phpt" #12575 by deleting || kind == ZEND_LIVE_LOOP
.
I was able to see from the @sboikov1983 test that from 1.1s I fell back to 0.3s
#if 1
znode tmp_result;
opnum_reset = get_next_op_number();
zend_emit_op(&tmp_result, ZEND_MAKE_REF, &expr_node, NULL);
opnum_reset = get_next_op_number();
opline = zend_emit_op(&reset_node, by_ref ? ZEND_FE_RESET_RW : ZEND_FE_RESET_R, &tmp_result, NULL);
#else
opnum_reset = get_next_op_number();
opline = zend_emit_op(&reset_node, by_ref ? ZEND_FE_RESET_RW : ZEND_FE_RESET_R, &expr_node, NULL);
#endif
Not having a global vision of how PHP works. I don't know what this modification is worth. Is this the right way to do it?
The degradation occurs because https://github.com/php/php-src/commit/abe3673d1fd09a35c25b4da6248f7e8c106aa37e adds a big array into GC buffer, and then it's iteratively checked by GC.
It doesn't slow-downs foreach
but increases amount of GC work, when it's repeatable launched from inside the foreach
loop.
The fix looks right, and I don't see a simple way to fix this degradation. I'll think a bit more...
The issue was closed by mistake
I thought we may reduce the degradation by limiting effect if the old fix only to references. But this doesn't work. The next example explains this. It starts to fail (leak) when the old fix removed.
<?php
function foo() {
$a = [];
$a[0] = 1;
$a[1] = &$a;
return $a;
}
foreach(foo() as $v) { // foo() returns a circular array
var_dump(gc_collect_cycles()); // GC can't free the array because it's still iterated
break;
}
var_dump(gc_collect_cycles()); // GC is able to detect and free the array only if it was added to garbage candidates buffer after the first GC above (see the old fix)
?>
@iluuu1994 @nielsdos may be you have any ideas?
I don't really have a better idea. The only thing I think about is changing the VM instead of GC, like so: https://gist.github.com/nielsdos/80c7f19ab12a4ae6678dc403246435ba this reduces the impact on the performance, it seems on par now with what it used to be.
To be honest I am not very familiar with the GC and its nuances, I did not read much code of it yet. So probably I am missing something why my idea is wrong.
@nielsdod the current solution adds iterated variable into GC buffer only after GC launched from inside a foreach
loop. Your solution will add into GC buffer any variable used in foreach
. This may help to foreach over very big arrays (like in this test case), but on the other hand will slowdown apps with many foreach
over smaller arrays.
Okay I understand, thanks for pointing that out. I was not able to come up with something else.
@SVGAnimate we can't disable array modification and this won't fix the problem anyway (see my last example above).
The problem that on each foreach
iteration we increment/decrement refcounter of the current element (because of assignment to $v) and then store it into GC buffer. This leads to too often calls to GC collector and in case of a big array we get quadratic slowdown.
I think, the problem may be reduced in general, if before the main GC collector algorithm we will remove from GC buffer all elements reachable from the VM roots (global and static variables, CVs and alive temporary variables of stack frames, etc...) We may start only with alive temporary variables...
I'm not sure when I get time to check this idea. @iluuu1994 @nielsdos can you think in this direction?
I had thought of something completely different to maintain PHP performance.
What do you think about not creating an internal reference(using ZEND_FE_RESET_R instead of ZEND_FE_RESET_RW )?
Let me explain. in zend_compile_foreach we could use the $array[$key] variable(opline->op2
) as an alias instead of &$value.
This involves adding a context to the zend_compile_stmt(stmt_ast, context_alias)
It can also be used for list() foreach($array as [&$x])
($x
is compiled as an alias of $array[$key][0]
instead a zval
PS: Backward compatibility bugs will not be supported. There is therefore little chance that this will be accepted...
@sboikov1983 It's normal that PHP is slower, now it fixes memory leaks. We can not have everything. In addition, creating an array of a million records takes 8s compared to 0.3/1.1s to browse it. Is this really the problem? Perhaps an extension allowing you to carry out your heavy calculations?
Description
We noticed significant performance degradation in our application after migration to PHP 8.2.13 (and PHP 8.3.0 has the same issue). Our application performs a lot of computations and in some cases execution time increased from 2 to 6 hours.
Here is a simple script demonstrating the problem (run as 'php -dmemory_limit=-1 perf_test.php'):
Here is timing for PHP 8.2.12 and PHP 8.2.13 when GC is enabled and disabled:
For 8.2.12 enabling GC doesn't have major impact on performance, but for 8.2.13 enabling GC increases time from 0.1 sec to 8 seconds (the same for 8.3.0). This is just a simple script to demonstrate the issue, in real application impact on performance is dramatic.
As I understand reason of additional GC overhead is this commit. I understand that it fixes potential memory leak, but perphaps there is a way to optimize it, so it won't cause such dramatic performance degradation?
I also understand that it's possible to selectively disable/enable GC in the application, or somehow get rid of 'foreach' loops. But since our application has thousands of such loops, changing the code would be very hard (and potentially the same issue can be in the 3rd party libraries which we can't change).
PHP Version
PHP 8.3
Operating System
Ubuntu 20.04, Windows 11