krakjoe / parallel

A succinct parallel concurrency API for PHP8
Other
1.47k stars 96 forks source link

Closure Reuse Issue in parallel: Cached Closures from Previous Scripts are Reused #309

Open Skypewalker opened 4 months ago

Skypewalker commented 4 months ago

Description

When using the parallel extension, closures from previously executed scripts are being reused in new scripts, even when the closures are expected to be different. This appears to be due to the caching mechanism in php_parallel_cache_closure.

Steps to Reproduce

  1. Create a script that runs a closure using parallel\Runtime.
  2. Create another script that runs a different closure using parallel\Runtime.
  3. Observe that the closure from the first script is reused in the second script.

Script 1:

<?php
    use \parallel\{Runtime};

    $task1 = function($param){
        return "task1:$param";
    };

    $rt = new Runtime();
    $future = $rt->run($task1, array(1));
    $futureValue = $future->value();

    echo($futureValue);
?>

Script 2:

<?php
    use \parallel\{Runtime};

    $task2 = function($param){
        return "task2:$param";
    };

    $rt = new Runtime();
    $future = $rt->run($task2, array(2));
    $futureValue = $future->value();

    echo($futureValue);
?>

Expected Behavior

If you execute script 1 the output should be: task1:1

If you then execute script 2 the output should be: task2:2

Actual Behavior

The closure from the first script is reused in the second script, leading to unexpected behavior. script 1 echoes task1:1 but script 2 echoes task1:2

Environment

Additional Information

After inspecting the source code, it appears that the issue lies in the caching mechanism of the php_parallel_cache_closure function, which uses the opcode array to identify and cache closures. If two closures have identical opcode arrays, as in an example, they are treated as the same closure.

Suggested Fix

Consider adding a mechanism to ensure closures are uniquely identified, possibly by including additional context or metadata in the cache key.

realFlowControl commented 3 months ago

Hey @Skypewalker, thanks for your bug report, I was able to reproduce this behaviour. If you like to help, could you try and see if you are able to replicate this behaviour in a single request or on CLI?

Skypewalker commented 3 months ago

Hey, I didn't manage to replicate this behavior in a single request. I tried several scenarios with two closures with the same OPCode signatures and everything worked fine. Until I requested another script that contains closure with the same OPCode as the first one. So, as far as I noticed this bug happens only between different scripts.

pielonet commented 1 month ago

Hi, I faced the same kind of issue two days ago with two closures in two different scripts with the same number of arguments but different signatures : the first one is expecting a Channel as second argument, the second one is expecting an array (of Channel) as second argument with a different argument name.

In the first script

// Launch controller in a parallel thread
$controller = \parallel\run(
    function(Channel $controller_channel, Channel $queue_channel, array $config): array {
...

In the second script

// Launch controller in a parallel thread
$controller = \parallel\run(
    function(Channel $controller_channel, array $queues_channels, array $config): array {
...

When running the second script after the first one has run I receive a runtime error : TypeError because the second argument passed to the closure does not match the expected type. The error refers to the closure in the first script, although it was running the second script, which I found astounding.

Here is the error message :

Fatal error: Uncaught TypeError: {closure}(): Argument #2 ($queue_channel) must be of type parallel\Channel, array given in /app/tutorial/example10/main.php:24
Stack trace:
#0 [internal function]: {closure}(Object(parallel\Channel), Array, Array)
#1 {main}
  thrown in /app/tutorial/example10/main.php on line 24

The complete code that fails can be found here : https://github.com/pielonet/concurrency/blob/main/tutorial/example42/main.php

I hopefully found this bug report before becoming crazy...

Please consider fixing this soon.

I offer help for testing if needed.

Kind regards, Thierry

realFlowControl commented 1 month ago

Hey there, are you able to replicate the issue with OPcache enabled as well? For CLI that'd require to set opcache.enable_cli = 1.

pielonet commented 1 month ago

Hi, Enabling opcache fixed my issue ! Does it mean opcache should always be enabled for php/parallel to work properly ?

realFlowControl commented 1 month ago

Enabling opcache fixed my issue !

That's good to hear, I'll try to fix it anyway 😉

Does it mean opcache should always be enabled for php/parallel to work properly ?

In some cases it helps, especially when "reusing" code, or in cases like this, where the address in memory could be the same when OPcache is not enabled. I would not go as far as saying ext-parallel depends on OPcache, as it should also work without.

pielonet commented 1 month ago

If I understand well the issue I faced was a bit different from the one first reported here (as I wasn't using opcache). Should I open a new issue for my particular case ?

realFlowControl commented 1 month ago

Nah, I guess it is okay. I assume that @Skypewalker also hit this bug without OPcache enabled.

Skypewalker commented 1 month ago

Hi, I just saw that opchache.ini wasn't loaded at all and Zend OPcache section was missing entirely in phpinfo(). When I included opchache.ini into configuration (which has both caches enabled) everything worked fine. Thank you very much for your effort.

pielonet commented 1 month ago

@realFlowControl Thanks a lot for your support ! Good luck fixing the bug (which is no longer too urgent).