rialto-php / rialto

Manage Node resources with PHP
MIT License

Support parallel calls #9

Closed nesk closed 6 years ago

nesk commented 6 years ago

The problem

It's currently impossible to run multiple Node instructions in parallel. Adding this feature would be great for some Rialto implementations, like PuPHPeteer, where making multiple page navigations in parallel would save time.

Proposed API

Due to the synchronous nature of PHP, it's impossible to run and wait for multiple instructions at the same time. However, it is possible to store the instructions and send them all at once to Node, which will answer only once all the instructions are done.

Here's what the API could look like (using PuPHPeteer as the implementation):

$browser = (new Puppeteer)->launch();

$page1 = $browser->newPage();
$page2 = $browser->newPage();

[$response1, $response2] = Puppeteer::parallelize(function () use ($page1, $page2) {
    $page1->goto('https://github.com/nesk/');
    $page2->goto('https://github.com/-not--a--real--profile-/');
});

echo $response1->status(); // 200
echo $response2->status(); // 404

How it works

The workflow is the following:

  1. The user starts parallelization (Rialto switches ON a flag available to all resources).
  2. All the instructions made by the user are stored in a buffer instead of being immediately sent to the Node process.
  3. The user stops parallelization (Rialto switches OFF the flag).
  4. The process supervisor sends all the previously stored instructions at once to the Node process (in a special parallelization data container).
  5. The Node process executes all the instructions and returns all of their values at once.
  6. The user can now read the values returned by all the instructions, in the order in which the instructions were issued.
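
To make the workflow above more concrete, here is a minimal sketch of the buffering logic. It is not Rialto's actual code: the `Supervisor` class, its `execute()` and `parallelize()` methods, and the `$fakeNode` transport are all hypothetical names invented for this illustration, and the Node process is replaced by a plain PHP closure that answers a whole batch at once.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the process supervisor; none of these names
// are taken from Rialto itself.
final class Supervisor
{
    /** True while instructions must be buffered (steps 1 and 3). */
    private bool $parallelizing = false;

    /** Instructions stored during parallelization (step 2). */
    private array $buffer = [];

    /** Callable that receives a batch of instructions and returns their values. */
    private $transport;

    public function __construct(callable $transport)
    {
        $this->transport = $transport;
    }

    /** Called by a resource for every instruction issued by the user. */
    public function execute(string $method, array $args)
    {
        if ($this->parallelizing) {
            $this->buffer[] = [$method, $args]; // step 2: store, do not send
            return null;
        }

        // Normal mode: one instruction, one round trip.
        return ($this->transport)([[$method, $args]])[0];
    }

    public function parallelize(callable $callback): array
    {
        $this->parallelizing = true; // step 1: switch the flag ON
        $this->buffer = [];

        try {
            $callback();
        } finally {
            $this->parallelizing = false; // step 3: switch the flag OFF
        }

        $batch = $this->buffer;
        $this->buffer = [];

        // Steps 4 to 6: send everything at once, get all the values back
        // in the order the instructions were issued.
        return ($this->transport)($batch);
    }
}

// Assumption: a fake "Node process" that simply echoes each instruction back.
$fakeNode = function (array $batch): array {
    return array_map(fn (array $i) => strtoupper($i[0]) . ':' . $i[1][0], $batch);
};

$supervisor = new Supervisor($fakeNode);

$results = $supervisor->parallelize(function () use ($supervisor) {
    $supervisor->execute('goto', ['a']);
    $supervisor->execute('goto', ['b']);
});
```

In this sketch, `$results` holds one value per buffered instruction, which is what allows the `[$response1, $response2] = ...` destructuring shown in the proposed API.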

The API can be confusing

Since instructions aren't executed immediately, it is not possible to chain them during parallelization. For example, this will not work:

Puppeteer::parallelize(function () use ($page1) {
    echo $page1->goto('https://github.com/nesk/')->status();
});

This issue cannot be solved, but we must properly warn the user. To address this, we could make the parallelized instructions return a PendingParallelResource, which throws an exception on any attempt to call a method on it or to get or set one of its properties.
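
A rough sketch of what such a PendingParallelResource could look like, relying on PHP's `__call`, `__get` and `__set` magic methods. Only the class name comes from this proposal; the choice of `LogicException`, the `notReady()` helper and the error message are assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical placeholder returned by every instruction issued while
// parallelization is active. Any interaction with it fails loudly.
final class PendingParallelResource
{
    public function __call(string $name, array $arguments)
    {
        throw self::notReady("call method '{$name}'");
    }

    public function __get(string $name)
    {
        throw self::notReady("read property '{$name}'");
    }

    public function __set(string $name, $value): void
    {
        throw self::notReady("write property '{$name}'");
    }

    // Builds the exception shared by the three magic methods.
    private static function notReady(string $action): LogicException
    {
        return new LogicException(
            "Cannot {$action} on a pending resource: the instruction has not been executed yet. "
            . 'Read the values returned once parallelization has ended instead.'
        );
    }
}

// Usage: chaining on a pending resource fails with an explicit message.
$pending = new PendingParallelResource();

try {
    $pending->status();
} catch (LogicException $exception) {
    $message = $exception->getMessage();
}
```

With a placeholder like this, the chained `goto(...)->status()` call from the example above would fail with a clear message instead of a confusing error about calling a method on null.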


Original idea from nesk/rialto#2 by @mdeora

nesk commented 6 years ago

Closing in favor of nesk/rialto#13.