strongswan / strongswan

strongSwan - IPsec-based VPN
https://www.strongswan.org
Other
2.28k stars 776 forks source link

Python timeout #1562

Closed yadutaf closed 1 year ago

yadutaf commented 1 year ago

Since the merge of https://github.com/strongswan/strongswan/pull/1416, any timeout set directly on the socket is systematically reset by vici.Transport.receive() unless called by vici.Session.listen() thus making it impossible to configure some default timeout from the outside.

There could be at least 2 solutions for this:

  1. Add a notion of "default timeout" on the Session that would be forwarded and applied by the Transport and let the exception propagate.
  2. Add an explicit optional "timeout" parameter on all network method (vici.Session.request(), vici.Session.streamed_request()) as well as each of their wrapper in CommandWrappers.

If either of these look acceptable to the project, I can open a PR.

tobiasbrunner commented 1 year ago

Instead of unconditionally calling settimeout() in Transport::_recvall():

https://github.com/strongswan/strongswan/blob/d605584a7aaa8c18e51afa40710b8b0485e51cc2/src/libcharon/plugins/vici/python/vici/protocol.py#L34-L38

Could something like this work?

diff --git a/src/libcharon/plugins/vici/python/vici/protocol.py b/src/libcharon>
index 8793651882d6..4078944e77c2 100644
--- a/src/libcharon/plugins/vici/python/vici/protocol.py
+++ b/src/libcharon/plugins/vici/python/vici/protocol.py
@@ -31,11 +31,14 @@ class Transport(object):
     def _recvall(self, count, timeout=None):
         """Ensure to read count bytes from the socket"""
         data = b""
-        if count > 0:
+        if count > 0 and timeout is not None:
+            old_timeout = self.socket.gettimeout()
             self.socket.settimeout(timeout)
         while len(data) < count:
             buf = self.socket.recv(count - len(data))
-            self.socket.settimeout(None)
+            if timeout is not None:
+                self.socket.settimeout(old_timeout)
+                timeout = None
             if not buf:
                 raise socket.error('Connection closed')
             data += buf

I guess if passing None as timeout to listen() was a valid use case (e.g. if there was already a timeout set on the socket but listening should happen blocking for some reason), this required some special value as default (e.g. something like RECV_NO_TIMEOUT = object() on the protocol module level).

yadutaf commented 1 year ago

Your proposition has the additional benefit of preserving the API behavior. Introducing a special parameter like RECV_NO_TIMEOUT_CHANGE could guarantee that the behavior does not change at all from the previous version when no timeout is explicitly set.

Here is a variant of the proposition with this special constant. It tries to preserve the intention of https://github.com/strongswan/strongswan/pull/1416 which timeouts only for the first byte of the reply and then goes back to blocking mode. I'm not 100% this is desirable, though.

(This code could be simplified if it sets the same timeout for all read operations and moves the responsibility of dealing with timeouts and partial answers to the caller)

diff --git a/src/libcharon/plugins/vici/python/vici/protocol.py b/src/libcharon/plugins/vici/python/vici/protocol.py
index 879365188..2c0d5c854 100644
--- a/src/libcharon/plugins/vici/python/vici/protocol.py
+++ b/src/libcharon/plugins/vici/python/vici/protocol.py
@@ -8,6 +8,8 @@ from collections import OrderedDict
 from .exception import DeserializationException

+RECV_NO_TIMEOUT_CHANGE = object()
+
 class Transport(object):
     HEADER_LENGTH = 4
     MAX_SEGMENT = 512 * 1024
@@ -18,7 +20,7 @@ class Transport(object):
     def send(self, packet):
         self.socket.sendall(struct.pack("!I", len(packet)) + packet)

-    def receive(self, timeout=None):
+    def receive(self, timeout=RECV_NO_TIMEOUT_CHANGE):
         raw_length = self._recvall(self.HEADER_LENGTH, timeout)
         length, = struct.unpack("!I", raw_length)
         payload = self._recvall(length)
@@ -28,17 +30,25 @@ class Transport(object):
         self.socket.shutdown(socket.SHUT_RDWR)
         self.socket.close()

-    def _recvall(self, count, timeout=None):
+    def _recvall(self, count, timeout=RECV_NO_TIMEOUT_CHANGE):
         """Ensure to read count bytes from the socket"""
         data = b""
-        if count > 0:
+        old_timeout = RECV_NO_TIMEOUT_CHANGE
+        if timeout is not RECV_NO_TIMEOUT_CHANGE:
+            old_timeout = self.socket.gettimeout()
             self.socket.settimeout(timeout)
-        while len(data) < count:
-            buf = self.socket.recv(count - len(data))
-            self.socket.settimeout(None)
-            if not buf:
-                raise socket.error('Connection closed')
-            data += buf
+        try:
+            while len(data) < count:
+                buf = self.socket.recv(count - len(data))
+                if old_timeout is not RECV_NO_TIMEOUT_CHANGE:
+                    self.socket.settimeout(None)
+                if not buf:
+                    raise socket.error('Connection closed')
+                data += buf
+        finally:
+            if old_timeout is not RECV_NO_TIMEOUT_CHANGE:
+                self.socket.settimeout(old_timeout)
+
         return data
tobiasbrunner commented 1 year ago

Introducing a special parameter like RECV_NO_TIMEOUT_CHANGE could guarantee that the behavior does not change at all from the previous version when no timeout is explicitly set.

From your perspective, is it expected that Session::listen() will be blocking if no timeout is passed explicitly even if a default timeout is configured on the socket? Otherwise, this constant would also have to be used as default for timeout there. Or we just use None everywhere, which would honor the existing timeout, but then you wouldn't have the option to make listen() blocking by explicitly passing None if a timeout is already set.

It tries to preserve the intention of #1416 which timeouts only for the first byte of the reply and then goes back to blocking mode. I'm not 100% this is desirable, though

Yeah, I was wondering about that. But I now think this makes sense, because the current callers definitely can't deal with partial data. The patch with just using None as default would then become:

diff --git a/src/libcharon/plugins/vici/python/vici/protocol.py b/src/libcharon/plugins/vici/python/vici/protocol.py
index 8793651882d6..ba95b7e16bf2 100644
--- a/src/libcharon/plugins/vici/python/vici/protocol.py
+++ b/src/libcharon/plugins/vici/python/vici/protocol.py
@@ -31,14 +31,21 @@ class Transport(object):
     def _recvall(self, count, timeout=None):
         """Ensure to read count bytes from the socket"""
         data = b""
-        if count > 0:
+        old_timeout = None
+        if count > 0 and timeout is not None:
+            old_timeout = self.socket.gettimeout()
             self.socket.settimeout(timeout)
-        while len(data) < count:
-            buf = self.socket.recv(count - len(data))
-            self.socket.settimeout(None)
-            if not buf:
-                raise socket.error('Connection closed')
-            data += buf
+        try:
+            while len(data) < count:
+                buf = self.socket.recv(count - len(data))
+                if old_timeout is not None:
+                    self.socket.settimeout(None)
+                if not buf:
+                    raise socket.error('Connection closed')
+                data += buf
+        finally:
+            if old_timeout is not None:
+                self.socket.settimeout(old_timeout)
         return data
yadutaf commented 1 year ago

From your perspective, is it expected that Session::listen() will be blocking if no timeout is passed explicitly even if a default timeout is configured on the socket? Otherwise, this constant would also have to be used as default for timeout there

That's definitely something I missed in the patch above. The default argument of listen would indeed need to be changed.

Or we just use None everywhere, which would honor the existing timeout, but then you wouldn't have the option to make listen() blocking by explicitly passing None if a timeout is already set.

I'm not comfortable with using None as a timeout with a semantic different than self.socket.settimeout() This can create subtle mis-understanding and I tend to favor the "least surprise" principle.

But I now think this makes sense, because the current callers definitely can't deal with partial data

I share your point of view. And it highlights the limit of my current approach with a global timeout on the socket. To avoid the "partial data" issue, the socket could be systematically moved to blocking mode once the first bytes have been received.

Here is a possible variant where None keeps the semantic of socket.settimeout(), the default parameter is applied on listen() and we guarantee that no partial message will be read, even when a timeout was manually set on the socket:

diff --git a/src/libcharon/plugins/vici/python/vici/protocol.py b/src/libcharon/plugins/vici/python/vici/protocol.py
index 879365188..7679d29a1 100644
--- a/src/libcharon/plugins/vici/python/vici/protocol.py
+++ b/src/libcharon/plugins/vici/python/vici/protocol.py
@@ -8,6 +8,8 @@ from collections import OrderedDict
 from .exception import DeserializationException

+RECV_TIMEOUT_DEFAULT = object()
+
 class Transport(object):
     HEADER_LENGTH = 4
     MAX_SEGMENT = 512 * 1024
@@ -18,7 +20,7 @@ class Transport(object):
     def send(self, packet):
         self.socket.sendall(struct.pack("!I", len(packet)) + packet)

-    def receive(self, timeout=None):
+    def receive(self, timeout=RECV_TIMEOUT_DEFAULT):
         raw_length = self._recvall(self.HEADER_LENGTH, timeout)
         length, = struct.unpack("!I", raw_length)
         payload = self._recvall(length)
@@ -28,17 +30,22 @@ class Transport(object):
         self.socket.shutdown(socket.SHUT_RDWR)
         self.socket.close()

-    def _recvall(self, count, timeout=None):
+    def _recvall(self, count, timeout=RECV_TIMEOUT_DEFAULT):
         """Ensure to read count bytes from the socket"""
         data = b""
-        if count > 0:
+        old_timeout = self.socket.gettimeout()
+        if timeout is not RECV_TIMEOUT_DEFAULT:
             self.socket.settimeout(timeout)
-        while len(data) < count:
-            buf = self.socket.recv(count - len(data))
-            self.socket.settimeout(None)
-            if not buf:
-                raise socket.error('Connection closed')
-            data += buf
+        try:
+            while len(data) < count:
+                buf = self.socket.recv(count - len(data))
+                self.socket.settimeout(None)
+                if not buf:
+                    raise socket.error('Connection closed')
+                data += buf
+        finally:
+            self.socket.settimeout(old_timeout)
+
         return data

diff --git a/src/libcharon/plugins/vici/python/vici/session.py b/src/libcharon/plugins/vici/python/vici/session.py
index 79f2bda81..979b30aac 100644
--- a/src/libcharon/plugins/vici/python/vici/session.py
+++ b/src/libcharon/plugins/vici/python/vici/session.py
@@ -1,7 +1,7 @@
 import socket

 from .exception import SessionException, CommandException, EventUnknownException
-from .protocol import Transport, Packet, Message
+from .protocol import Transport, Packet, Message, RECV_TIMEOUT_DEFAULT
 from .command_wrappers import CommandWrappers

@@ -141,7 +141,7 @@ class Session(CommandWrappers, object):
                     )
                 )

-    def listen(self, event_types, timeout=None):
+    def listen(self, event_types, timeout=RECV_TIMEOUT_DEFAULT):
         """Register and listen for the given events.

         If a timeout is given, the generator produces a (None, None) tuple

Maybe a better version would be to deprecate direct socket access and expose a settimeout on the Transport and/or Session. Would you like me to open a PR ? It may be easier to iterate on the code.

tobiasbrunner commented 1 year ago

To avoid the "partial data" issue, the socket could be systematically moved to blocking mode once the first bytes have been received.

Yes, that sounds like a good idea.

Maybe a better version would be to deprecate direct socket access and expose a settimeout on the Transport and/or Session.

I suppose, but this would require quite some changes to Session. It currently only supports the default UNIX socket at /var/run/charon.vici (which might not even exist at that location if piddir was changed). Connecting to a socket at a different location or using TCP required additional constructors etc. (there is also a patch lying around that allows using AF_VSOCK sockets for vici). I feel just allowing users to pass the socket they want is more flexible.

Would you like me to open a PR ?

Yes, please do.