irmen / Pyro5

Pyro 5 - Python remote objects
https://pyro5.readthedocs.io
MIT License
305 stars 36 forks source link

oneway method call followed by _pyroRelease doesn't get called #20

Closed samschott closed 4 years ago

samschott commented 4 years ago

When a oneway call is followed immediately by _pyroRelease, the server method never gets called. This only occurs with communication over a unix domain socket and seems to be a regression from v5.8 to v5.9. This is an issue for instance when using the Proxy as a context manager when the last call is a oneway call before leaving the context.

An example to reproduce this, adapted from examples/oneway/server2:

Server:

import time
import threading
from Pyro5.api import expose, oneway, behavior, serve, Daemon

@expose
@behavior("single")
class Server(object):
    def __init__(self):
        self.counter = 0

    @oneway
    def increment_oneway(self):
        print("oneway call executing in thread", threading.get_ident())
        time.sleep(0.5)
        self.counter += 1

    def increment(self):
        time.sleep(0.5)
        self.counter += 1

    def getcount(self):
        return self.counter

with Daemon(unixsocket='/Users/samschott/Desktop/example.sock') as daemon:
    daemon.register(Server, "example")
    daemon.requestLoop()

Client:

This won't work:

from Pyro5.api import Proxy

with Proxy("PYRO:example@./u:/Users/samschott/Desktop/example.sock") as serv:
    serv.increment_oneway()

But giving it some time does work:

import time
from Pyro5.api import Proxy

with Proxy("PYRO:example@./u:/Users/samschott/Desktop/example.sock") as serv:
    serv.increment_oneway()
    time.sleep(0.1)
irmen commented 4 years ago

There's no change since 5.8 that looks like it was of influence on this behavior. I suspect that you just "were lucky" earlier that you weren't running into this race condition.

That said, I'm not sure what could be causing this. A oneway call (on the client) is just a regular call without waiting and processing the response, but the request is sent just as a normal call. This means once the increment_oneway() method returns, the request message has been sent fully to the server already. Closing down the proxy/socket after that should be of no consequence. I'll see if I can reproduce your finding.

Does it only occur on unix sockets? or also on tcp/ip sockets? And does it only occur with oneway calls?

samschott commented 4 years ago

I am similarly confused about this, but it is very reproducible when updating from 5.8 to 5.9 or 5.9.1. Something about the oneway call on the client side seems to be asynchronous.

Just for context, I am seeing this on Python 3.7.7 an macOS Catalina.

irmen commented 4 years ago

Thanks for the sample code btw. I can't reproduce the issue on my system right now (Linux, python 3.8), so this is not going to be easy...

irmen commented 4 years ago

Could you add debug logging to your server and let us know what gets printed in the succesful case and the failing case? Simply set these environment variables before launching the server:

export PYRO_LOGLEVEL=DEBUG
export PYRO_LOGWIRE=1
export PYRO_LOGFILE={stderr}
samschott commented 4 years ago

To answer your previous questions: I see this only on unix sockets and oneway calls, tcp/ip sockets and regular calls work just fine.

I’ll get some debug output for you later today.

irmen commented 4 years ago

Ok. As an experiment, could you also perhaps try this little modification to Pyro's client code to see if this makes any difference on MacOS perhaps?

diff --git a/Pyro5/client.py b/Pyro5/client.py
index ca0e137..b8d5257 100644
--- a/Pyro5/client.py
+++ b/Pyro5/client.py
@@ -4,6 +4,7 @@ Client related classes (Proxy, mostly)
 Pyro - Python Remote Objects.  Copyright by Irmen de Jong (irmen@razorvine.net).
 """

+import os
 import time
 import logging
 import serpent
@@ -219,6 +220,9 @@ class Proxy(object):
             self._pyroConnection.send(msg.data)
             del msg  # invite GC to collect the object, don't wait for out-of-scope
             if flags & protocol.FLAGS_ONEWAY:
+                if self._pyroConnection.family() == "Unix":
+                    with os.fdopen(self._pyroConnection.fileno(), "ab", 0, closefd=False) as ff:
+                        ff.flush()   # make sure the request is written to the unix socket before returning
                 return None  # oneway call, no response data
             else:
                 msg = protocol.recv_stub(self._pyroConnection, [protocol.MSG_RESULT])
samschott commented 4 years ago

Here is the debug output (server side) in the failed case:

[2020-04-19 18:18:49.681,Pyro5,INFO] Pyro log configured using built-in defaults, level=DEBUG
[2020-04-19 18:18:49.741,Pyro5.server,DEBUG] exposing all members of <class 'Pyro5.server.DaemonObject'>
[2020-04-19 18:18:49.742,Pyro5.server,DEBUG] exposing all members of <class 'Pyro5.nameserver.NameServer'>
[2020-04-19 18:18:49.743,Pyro5.server,DEBUG] exposing all members of <class '__main__.Server'>
[2020-04-19 18:18:49.744,Pyro5.threadpoolserver,INFO] starting thread pool socketserver
[2020-04-19 18:18:49.744,Pyro5.threadpoolserver,INFO] not using SSL
[2020-04-19 18:18:49.745,Pyro5.threadpoolserver,DEBUG] worker pool created with initial size 4
[2020-04-19 18:18:49.746,Pyro5.server,DEBUG] daemon created on ./u:/Users/samschott/Desktop/example.sock - Unix (pid 815)
[2020-04-19 18:18:49.746,Pyro5.server,DEBUG] pyro protocol version: 502
[2020-04-19 18:18:49.746,Pyro5.server,INFO] daemon ./u:/Users/samschott/Desktop/example.sock entering requestloop
[2020-04-19 18:18:49.746,Pyro5.threadpoolserver,DEBUG] threadpool server requestloop
[2020-04-19 18:20:29.088,Pyro5.threadpoolserver,DEBUG] connected  - unencrypted
[2020-04-19 18:20:29.089,Pyro5.threadpoolserver,DEBUG] worker counts: 1 busy, 3 idle
[2020-04-19 18:20:29.090,Pyro5.server,DEBUG] daemon handshake received: msgtype=1 flags=0x0 ser=1 seq=0 num_annotations=0 corr_id=00000000-0000-0000-0000-000000000000
data=b"# serpent utf-8 python3.2\n{'handshake':'hello','object':'example'}"
[2020-04-19 18:20:29.091,Pyro5.server,DEBUG] daemon handshake response: msgtype=2 flags=0x0 ser=1 seq=0 num_annotations=0 corr_id=ad42f33a-da3d-449e-beef-fa5159ada5a4
data=b"PYRO\x01\xf6\x02\x01\x00@\x00\x00\x00\x00\x00\x97\x00\x00\x00\x00\xadB\xf3:\xda=D\x9e\xbe\xef\xfaQY\xad\xa5\xa4\x00\x00M\xc5# serpent utf-8 python3.2\n{'handshake':'hello','meta':{'methods':{'increment','increment_oneway','getcount'},'oneway':{'increment_oneway'},'attrs':()}}"
[2020-04-19 18:20:29.093,Pyro5.server,DEBUG] daemon wiredata received: msgtype=4 flags=0x4 ser=1 seq=1 num_annotations=0 corr_id=00000000-0000-0000-0000-000000000000
data=b"# serpent utf-8 python3.2\n('example','increment_oneway',(),{})"
[2020-04-19 18:20:29.094,Pyro5.server,DEBUG] instancemode single: creating new pyro object for <class '__main__.Server'>
[2020-04-19 18:20:29.094,Pyro5.threadpoolserver,DEBUG] disconnected 
[2020-04-19 18:20:29.094,Pyro5.threadpoolserver,DEBUG] worker counts: 0 busy, 4 idle

And in the successful case:

[2020-04-19 18:22:05.648,Pyro5,INFO] Pyro log configured using built-in defaults, level=DEBUG
[2020-04-19 18:22:05.707,Pyro5.server,DEBUG] exposing all members of <class 'Pyro5.server.DaemonObject'>
[2020-04-19 18:22:05.708,Pyro5.server,DEBUG] exposing all members of <class 'Pyro5.nameserver.NameServer'>
[2020-04-19 18:22:05.709,Pyro5.server,DEBUG] exposing all members of <class '__main__.Server'>
[2020-04-19 18:22:05.710,Pyro5.threadpoolserver,INFO] starting thread pool socketserver
[2020-04-19 18:22:05.710,Pyro5.threadpoolserver,INFO] not using SSL
[2020-04-19 18:22:05.711,Pyro5.threadpoolserver,DEBUG] worker pool created with initial size 4
[2020-04-19 18:22:05.712,Pyro5.server,DEBUG] daemon created on ./u:/Users/samschott/Desktop/example.sock - Unix (pid 827)
[2020-04-19 18:22:05.712,Pyro5.server,DEBUG] pyro protocol version: 502
[2020-04-19 18:22:05.712,Pyro5.server,INFO] daemon ./u:/Users/samschott/Desktop/example.sock entering requestloop
[2020-04-19 18:22:05.712,Pyro5.threadpoolserver,DEBUG] threadpool server requestloop
[2020-04-19 18:22:30.239,Pyro5.threadpoolserver,DEBUG] connected  - unencrypted
[2020-04-19 18:22:30.240,Pyro5.threadpoolserver,DEBUG] worker counts: 1 busy, 3 idle
[2020-04-19 18:22:30.240,Pyro5.server,DEBUG] daemon handshake received: msgtype=1 flags=0x0 ser=1 seq=0 num_annotations=0 corr_id=00000000-0000-0000-0000-000000000000
data=b"# serpent utf-8 python3.2\n{'handshake':'hello','object':'example'}"
[2020-04-19 18:22:30.241,Pyro5.server,DEBUG] daemon handshake response: msgtype=2 flags=0x0 ser=1 seq=0 num_annotations=0 corr_id=26b54343-1b08-4373-9b40-1256c5ab76e9
data=b"PYRO\x01\xf6\x02\x01\x00@\x00\x00\x00\x00\x00\x97\x00\x00\x00\x00&\xb5CC\x1b\x08Cs\x9b@\x12V\xc5\xabv\xe9\x00\x00M\xc5# serpent utf-8 python3.2\n{'handshake':'hello','meta':{'methods':{'increment_oneway','increment','getcount'},'oneway':{'increment_oneway'},'attrs':()}}"
[2020-04-19 18:22:30.243,Pyro5.server,DEBUG] daemon wiredata received: msgtype=4 flags=0x4 ser=1 seq=1 num_annotations=0 corr_id=00000000-0000-0000-0000-000000000000
data=b"# serpent utf-8 python3.2\n('example','increment_oneway',(),{})"
[2020-04-19 18:22:30.244,Pyro5.server,DEBUG] instancemode single: creating new pyro object for <class '__main__.Server'>
oneway call executing in thread 123145398317056
[2020-04-19 18:22:30.343,Pyro5.threadpoolserver,DEBUG] disconnected 
[2020-04-19 18:22:30.343,Pyro5.threadpoolserver,DEBUG] worker counts: 0 busy, 4 idle

Those two outputs look identical to me apart from the line oneway call executing in thread 123145398317056 which is printed by the exposed method itself.

Unfortunately, the patch does not make any difference. This makes sense since the method call seems to be received by the server in both cases.

samschott commented 4 years ago

I went through the commits between v5.8 and v5.9 to see when things stop working. It is commit 4d860a8, so apparently a server-side issue. In particular, it is line 482:

_OnewayCallThread(method, vargs, kwargs, self, conn.sock.getpeername()).start()

I suspect it is the call conn.sock.getpeername() which is failing, possibly because the connection is already closed. Passing pyro_client_sock=None instead fixes the issue (as long as the exposed method call does not raise any exceptions...).

Is it really necessary to handle errors in oneway calls? Shouldn't the call just fail silently to be truly oneway?

irmen commented 4 years ago

Thank you for digging through the commits and finding the culprit. Apparently the getpeername call is problematic on macOs on unix sockets. I'll try to investigate.

Yeah it's necessary to handle errors even in oneway calls - the error handler that was introduced in that commit is meant to monitor any error in the server, oneway call or not. It's not an error handler that is returning the error to the client.

samschott commented 4 years ago

No worries, since I am seeing the error it is a lot easier for me to track down the corresponding code.

Yeah it's necessary to handle errors even in oneway calls - the error handler that was introduced in that commit is meant to monitor any error in the server, oneway call or not.

Yes, I have noticed this as well now.

Apparently, you (or someone else) were already somewhat aware of problems with getpeername. I found the following exception handling in handleRequest a few lines further up:

https://github.com/irmen/Pyro5/blob/df77ec538b3d3b460f9e3dc17541854a63cd2a61/Pyro5/server.py#L420-L424

irmen commented 4 years ago

Indeed, so I'm now proposing the following three small modifications instead that avoid the extra call to getpeername() Can you try these?

diff --git a/Pyro5/server.py b/Pyro5/server.py
index ca2c4ce..d0f6cc1 100644
--- a/Pyro5/server.py
+++ b/Pyro5/server.py
@@ -439,7 +439,7 @@ class Daemon(object):
                         try:
                             result = method(*vargs, **kwargs)  # this is the actual method call to the Pyro object
                         except Exception as xv:
-                            self.methodcall_error_handler(self, conn.sock.getpeername(), method, vargs, kwargs, xv)
+                            self.methodcall_error_handler(self, current_context.client_sock_addr, method, vargs, kwargs, xv)
                             xv._pyroTraceback = errors.format_traceback(detailed=config.DETAILED_TRACEBACK)
                             data.append(core._ExceptionWrapper(xv))
                             break  # stop processing the rest of the batch
@@ -459,13 +459,13 @@ class Daemon(object):
                         if request_flags & protocol.FLAGS_ONEWAY:
                             # oneway call to be run inside its own thread, otherwise client blocking can still occur
                             #    on the next call on the same proxy
-                            _OnewayCallThread(method, vargs, kwargs, self, conn.sock.getpeername()).start()
+                            _OnewayCallThread(method, vargs, kwargs, self, current_context.client_sock_addr).start()
                         else:
                             isCallback = getattr(method, "_pyroCallback", False)
                             try:
                                 data = method(*vargs, **kwargs)  # this is the actual method call to the Pyro object
                             except Exception as xv:
-                                self.methodcall_error_handler(self, conn.sock.getpeername(), method, vargs, kwargs, xv)
+                                self.methodcall_error_handler(self, current_context.client_sock_addr, method, vargs, kwargs, xv)
                                 raise
                             if not request_flags & protocol.FLAGS_ONEWAY:
                                 isStream, data = self._streamResponse(data, conn)
samschott commented 4 years ago

That solves it! Thanks :)

Of course, if you would like to always pass the client socket to the error handler, there may be a way to save it when the message is received. I haven't looked into the code to see how difficult this would be...

irmen commented 4 years ago

Nice. I'll fix it properly soon then. Nah, it's okay, I don't quite remember the reason why that comment was there about getpeername() but apparently there are situations where it fails. It's nice when it succeeds but I'm okay with not knowing the client connected in certain error scenarios.