reactphp / socket

Async, streaming plaintext TCP/IP and secure TLS socket server and client connections for ReactPHP.
https://reactphp.org/socket/
MIT License
1.2k stars 156 forks source link

TcpConnector - are writes buffered or sent immediately? #262

Closed ondrejmirtes closed 3 years ago

ondrejmirtes commented 3 years ago

Hi, I'm debugging something in PHPStan and I'd like to know if the writes made on the ConnectionInterface are buffered in any way (if they're waiting to be sent on the next loop tick etc.) or if they're sent synchronously - immediately.

$tcpConector = new \React\Socket\TcpConnector($loop);
$tcpConector->connect(sprintf('127.0.0.1:%d', $port))->done(function (ConnectionInterface $connection): void {
    // I'm asking about this:
    $connection->write(...);
});

I'm kind of lost on debugging this by myself because of the complex object hierarchy, many interface implementations etc.

Thank you.

WyriHaximus commented 3 years ago

They are buffered until the kernel notifies us we can write it onto the socket.

Refs: https://github.com/reactphp/stream/blob/7a423506ee1903e89f1e08ec5f0ed430ff784ae9/src/WritableResourceStream.php#L60-L75 Refs: https://github.com/reactphp/stream/blob/7a423506ee1903e89f1e08ec5f0ed430ff784ae9/src/WritableResourceStream.php#L116-L167

ondrejmirtes commented 3 years ago

I realized what's probably happening from further debugging:

$connection->on('data', function () {
    foreach ($list as $job) {
        // do some work, send occasional ping every few seconds
        $connection->write('ping');
    }

    // send results, free the loop
    $connection->write('result');
});

What happens is that all the pings gets send along with the result, so the actual output looks like this:

Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge pinged
Child etzfaz2jge delivered results
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 pinged
Child wgbbc9ul49 delivered results

Instead of being more interlocked and random.

Is there any way to write from inside the data handler and have it sent immediately?

WyriHaximus commented 3 years ago

Is there any way to write from inside the data handler and have it sent immediately?

No. But looking a the referenced issue https://github.com/phpstan/phpstan/issues/5341 I have the suspension you're child is busy doing something when it gets the ping and wont respond to it right away.

How are you distributing tasks to your children? Is it all of them at the start, or are children requesting new tasks when done? The latter would be more efficient, and enforce two way confirmations when communicating.

ondrejmirtes commented 3 years ago

There's the main process (TCP server) and the children (TCP clients). The TCP clients ping the server with the code that's roughly this:

$connection->on('data', function () {
    foreach ($list as $job) {
        // do some work, send occasional ping every few seconds
        $connection->write('ping');
    }

    // send results, free the loop
    $connection->write('result');
});

Children are given jobs of default size 20 (analyse 20 files). Once they're done with a job, they're given the next job. I believe it's efficient and fair.

There's a timeout implemented using loop timer that's canceled once the child responded with a job result. The timeout is there in case a child process gets hanged on something and stops responding. My idea was to ping the main process from the child not only when the whole job is done but also to check after each file if it was >1 second since last ping and ping in that case too. That would bring the number of timeout errors down dramatically.

Ideally I'd love WritableResourceStream implementation that fwrite directly in write and not in handleWrite when the loop asks for that. It'd be easy to do that, but it'd be harder to get the implementation deep in the guts of TcpConnector where the original WritableResourceStream is hardcoded.

WyriHaximus commented 3 years ago

You don't need an event loop in the child I think to be honest. It's only doing one thing, and you want the insurance that the message is delivered. Alternatively you can do a ping pong with a code where the child pings with a code and the parent has to respond with a pong and that code to ensure that the parent process knows the child is still alive and the child knows for sure the buffer has been flushed.

ondrejmirtes commented 3 years ago

I might not need the loop but it was kind of nice to have React.PHP and specifically https://github.com/clue/reactphp-ndjson on both sides.

Do you happen to know some nice TCP abstraction library that would let me send messages to the server and wait for messages from the server?

WyriHaximus commented 3 years ago

Not really, any I/O I write is non-blocking these days.

ondrejmirtes commented 3 years ago

I'm gonna try https://github.com/hoaproject/Socket if I feel that rewriting the client is worth it :) Thank you for your help.

kelunik commented 3 years ago

I think PHP could benefit from low level functions exposing which bytes have been ACKed from the other TCP side. But for this case it's enough if the bytes are written to the socket, because everything else is handled by the kernel. Does React currently not expose an event for this?

WyriHaximus commented 3 years ago

@kelunik We don't expose that information because all write calls on a stream/socket are tossed on the buffer and we try to write as much as possible then so we don't track that information at all.