Closed jakkdl closed 11 months ago
I think before deciding how to deal with that, it'd be good to try and categorise the methods - which should an implementation always need to have defined, are there groups which should be defined together? Also how many could have a default implementation, beyond raise NotImplementedError
?
A 5th option might be to use a bunch of protocols, though there'd be a little bit of a runtime cost. Make SocketType
a Protocol
with the most basic methods, then have some additional protocols inheriting from that for any other "group" of methods. Code could use either @runtime_checkable
or a TypeGuard
using hasattr()
to detect if it applies. That won't help with verifying that they're valid for set_custom_socket_factory
though.
I think 1. is nice, because that's essentially the only method that I can think of that is both palatable and has a solid deprecation path (i.e. make certain methods abstract only at type checking time + check that they're non-abstract at initialization time, otherwise warn).
Ah, that sounds like a good solution and relatively straightforward.
It will lead to some minor API breakages if people are doing very fancy stuff in their classes inheriting from SocketType
, but that shouldn't be overly common?
The best way of categorizing methods is probably just to start writing a PR to get an overview. If anybody wants to take a stab feel free, otherwise I'll give it a go.
Option 5: Vaporise SocketType
and stop exposing raw BSD sockets. This would have to be done for uring support anyway.
Huh, what kind of interface changes are we looking at for that sort of removal? I'm not too familiar with what trio offers w/r/t sockets.
Why would io_uring force us to stop exposing send/recv/etc? There are ring ops for all those, even if we do want to avoid the readiness APIs for some reason.
On Sun, Aug 6, 2023, 17:22 EXPLOSION @.***> wrote:
Huh, what kind of interface changes are we looking at for that sort of removal? I'm not too familiar with what trio offers w/r/t sockets.
— Reply to this email directly, view it on GitHub https://github.com/python-trio/trio/issues/2720#issuecomment-1667018760, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEU42ARXX523PB5LCJR373XUAYMBANCNFSM6AAAAAA24NREGA . You are receiving this because you are subscribed to this thread.Message ID: @.***>
BSD sockets is more than just send/recv
, it has the awful type-punning socket address and socket option APIs. But aside from that, uring linked requests is intrinsically incompatible with the socket.socket
API of single-call waits. If you extend it, it's no longer a stdlib socket inside a wolf's suit, so why bother pretending to be one?
bind
-> multishot accept call with the socket API. shutdown
-> close
call which avoids going into the event loop twice to do the same thing (shutdown is not async. good luck doing a deprecation cycle for that!).If we wanted direct sockets (a good feature), we'd then have to use the io_uring
calls for sockopt, sockaddr, etc, which means either copying all the validation logic over from the terrible stringly-typed socket.socket
API so that somebody doeesn't pass in a too small buffer and segfaults. Or, better, make a proper strongly-typed API that can be statically typed to avoid these issues but is once again incompatible with stdlib sockets (so we should stop pretending to be one).
(We're not going to remove the current socket api, even if we do add fancy io_uring related features on top of it in the future, so for this issue it doesn't matter too much. @Fuyukai do you want to open a separate issue for possible new socket operations to expose via io_uring? E.g. you can't express a multishot accept with the current API, but we can certainly add a multishot accept API. idk what a shutdown
->close
is tho. And fyi it's totally legal to have a single socket that's both registered for direct io and has a regular fd for regular syscalls, if that's more convenient.)
Anyway for SocketType
, I think we want to basically make it
class SocketType:
def __init__(self, [appropriate types here]):
raise NotImplementedError
def recv(self) -> bytes:
raise NotImplementedError
# etc.
That makes it legible to type checkers, while keep all the current runtime behavior the same.
I think before deciding how to deal with that, it'd be good to try and categorise the methods - which should an implementation always need to have defined, are there groups which should be defined together? Also how many could have a default implementation, beyond raise NotImplementedError?
I think SocketType
needs to have everything, unless we overhaul how SocketFactory
works, since the return type of all methods will be SocketType
, and _SocketType
will never be seen in the public API, so if they ever want to call e.g. recvmsg_into
then it has to be part of the public API of SocketType
.
Default implementations could be made for all the pure wrapper methods though, if they check that the user-made SocketType
does have an underlying stdlib socket.
Something like
class SocketType:
def __init__(self):
if type(self) == SocketType:
raise TypeError("abstract class, use trio.socket.socket...")
self._socket: _stdlib_socket.socket | None = None
def detach(self) -> int:
if self._sock is None:
raise NotImplementedError("No default implementation unless self._socket is initialized")
return self._socket.detach()
For that case, what we do is make a protocol that has _socket
defined, and annotate self
with that. Then type checkers will complain if any of the defaults are used but that attribute is not present.
For that case, what we do is make a protocol that has
_socket
defined, and annotateself
with that. Then type checkers will complain if any of the defaults are used but that attribute is not present.
@TeamSpen210 just slapping on a HasSocket
protocol to the self
argument is turning out to be far from trivial, due to lack of Intersection
it warrants creating another type SocketTypeWithSocket
which looks like it ends up being the type used pretty much everywhere in the internal code, since out of 30 methods on SocketType
only 7 of them are more than a simple wrapper and left with a NotImplementedError
, and the others all require HasSocket
. Pretty much all user code is going to need to be typed as using SocketTypeWithSocket
(or _SocketType
).
If going in this direction I think it's probably better to just go all the way and fully merge the types, or expose _SocketType
as a public type that can be specified.
That makes it legible to type checkers, while keep all the current runtime behavior the same.
FTR: The raise NotImplementedError
certainly doesn't make all runtime behavior the same. Any code doing dynamic inspection on socket types with hasattr/getattr/dir
, which isn't that weird given that the current trio tests already do that for platform differences, could get messed up. Also any code that does multiple inheritance and uses super()
calls will now raise NotImplementedError
.
Ah, I suppose there's always
class SocketType:
if TYPE_CHECKING:
def recv(self, ...) -> bytes:
...
def fileno(self) -> int:
...
if we want to guarantee no impact on runtime.
I'm having trouble following what people are trying to accomplish here – what's wrong with my proposal above?
On Mon, Aug 28, 2023, 07:37 John Litborn @.***> wrote:
Ah, I suppose there's always
class Socket if TYPE_CHECKING: def recv(self, ...) -> bytes: ... def fileno(self) -> int: ...
if we want to guarantee no impact on runtime.
— Reply to this email directly, view it on GitHub https://github.com/python-trio/trio/issues/2720#issuecomment-1695817617, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEU42BIMWFI5IGLYTLWYKTXXSULRANCNFSM6AAAAAA24NREGA . You are receiving this because you commented.Message ID: @.***>
I'm having trouble following what people are trying to accomplish here – what's wrong with my proposal above?
Your proposal can affect runtime behaviour, though admittedly only in weird cases. But I mostly just see no upside of raise RuntimeError
vs hiding behind TYPE_CHECKING
, unless you specifically want to catch people doing weird things with SocketType
- which might make sense if long-term plan is to make SocketType
abstract, or merging the two types.
Going into protocols and/or default implementations seemed like a good idea, but is turning out to be really messy and I'm not going to bother trying that.
Conceptually SocketType
is an ABC
. It's just that since the API is huge and the main use case for implementing it is for tests, I bowed to practicality and made all the abstract methods optional -- the socket API is huge and any given user only touches a small fraction of it, and when you're writing a test fake, there's no point in making you fill in exotic methods you don't use. Plus, if you accidentally leave out an important method, the exact error doesn't matter so long as you get some error and fix your test; these aren't errors that you're catching and handling at runtime. And since this was before static type checking was a thing, I didn't bother typing in all the optional abstract methods, since all it would do is convert AttributeError
into NotImplementedError
and who cares?
Anyway that's the background.
Agreed that the difference between raise NotImplementedError
and if TYPE_CHECKING
is pretty minimal; I was more confused at how we were getting onto discussions about abstract protocols with _socket
attributes and stuff :-).
I think I do have a weak preference for raise NotImplementedError
over if TYPE_CHECKING
, simply because having the same code for runtime and static checking is one less thing you have to think about when figuring out what's going on, and in this case I don't think the runtime difference between AttributeError
vs NotImplementedError
is important or worth preserving -- arguably it should have been NotImplementedError
all along, just I was too lazy to bother :-) But either seems workable.
SocketType
is currently an empty type and what's meant for end-users to inherit from to show compatibility with trio sockets, but_SocketType
(which inherits fromSocketType
is what's actually used internally and has all the methods needed to get code to actually pass type checking.So we probably want to merge these one way or another, I started trying to tackle it in #2705 but all my ideas hit problems and will likely have some effect on user code that inherits from either of them, so I think it deserves it's own issue + PR.
In tests we have three different
FakeSocket
classes inheriting fromSocketType
: test_highlevel_socket.py, test_highlevel_open_tcp_stream.py and test_highlevel_open_tcp_listeners.pyOptions include:
SocketType
an abstract base class with abstract methods. Will break any code that has methods inheriting from it and not defined - so would need to only mark the methods that are absolutely required to be abstract, and others could instead raiseNotImplementedError
. *. Will almost certainly raise a lot of type checking errors.SocketType
. I think it's impossible to avoid this being a breaking API change.SocketType = _SocketType
. *. will raise a ton of type checking errors, might be possible to reduce the number with some effort._SocketType
something likeInternalTrioSocketType
and have that be the class used in signatures that require some methods to be available.