UDP stack compatible with embassy-net; socket splitting

ivmarkov commented 7 months ago

Addresses #103

As per the subject, this PR contains two changes (I can split into two separate PRs as well, but given that both would be backwards incompatible, and are touching the same traits, perhaps it is better to discuss those in one go):

The PR introduces lifetimes in the UdpStack::Connected, UdpStack::UniquelyBound and UdpStack::MultiplyBound associated types, making those GATs
- The change is similar and following the lead as to how the the TcpConnect::Connection type is modeled already
- It is necessary, so that the e-nal-async UdpStack trait is implementable in embassy-net and possibly other non-allocating network stacks using an identical approach to the already implemented TcpConnect factory trait
The second change in this PR is that it puts an extra requirement on the UdpSocket factory trait. Namely, that the sockets it returns should be splittable into two separate halves - "receive" and "send" which can be used concurrently in app protocols that do require full-duplex UDP datagram sending and receiving. This is done by:
- Splitting the ConnectedUdp and UnconnectedUdp existing traits into separate "send" and "receive" traits: ConnectedUdpReceive, ConnectedUdpSend, UnconnectedUdpReceive and UnconnectedUdpSend - similarly as to how the TcpConnect::Connection associated type is constrained by two separate traits for reading and writing from/to TCP sockets - namely - embedded_io_async::Read and embedded_io_async::Write
- Introducing two new traits: ConnectedUdpSplit and UnconnectedUdpSplit - which model the notion of a splittable connected/unconnected UDP socket by using Send/Receive lifetimed associated types.

Background on these changes, and justification for introducing UDP socket splitting can be found in #103

Open topics / next steps:

As per #103 I think we should come to an agreement that socket splitting is important and necessary.
Regardless of the outcome of the above, I think the separation of ConnectedUdp and UnconnectedUdp into "receive" and "send" parts stands on its own. Reason being - if a concrete network stack does allow UDP socket splitting, and even if we decide (regrettably) not to require this in the UdpSocket factory, app code that does need split UDP sockets would still benefit from the existence of separate "send" and "receive" traits, for the very same reasons we have separate Read and Write traits for TCP / streaming cases - i.e. the app code can take an "already splitted" in a platform-specific way UDP socket in the form of - e.g. async fn foo(send: impl UnconnectedUdpSend, recv: impl UnconnectedUdpReceive) and then proceed with polling these separately
(UPDATED) The existing traits' names are mouthful. How about the following simplification:
- Rename:
- UnconnectedUdpReceive -> UdpReceive
- UnconnectedUdpSend -> UdpSend
- UnconnectedUdpSplit -> UdpSplit
- Retire ALL of the Connected* traits and use the Unconnected* ones for the connected case as well. We are anyway already a bit in this situation, by re-using the Unconnected* traits for "single-bound" sockets, even though the API signatures of the unconnected traits are not a perfect fit there, as they take / return the local socket addr, which serves no useful purpose when the socket is already single-bound to a fixed local IP address.

ivmarkov commented 7 months ago

@chrysn @MathiasKoch Would appreciate comments / feedback here. :) As in, am I on the right track? any concerns / suggestions?

chrysn commented 7 months ago

I think the lifetime change is on the right track.

The splitting worries me -- while so far RIOT sends instantly, once they allow network interface backpressure (there were bugs due to the lack of it), splitting becomes messy at best.

Unified types IMO would require per-type send and receive AT from https://github.com/rust-embedded-community/embedded-nal/issues/92 -- so lets do that :-)

ivmarkov commented 7 months ago

The splitting worries me -- while so far RIOT sends instantly, once they allow network interface backpressure (there were bugs due to the lack of it), splitting becomes messy at best.

The problem with not enhancing UdpStack to return UdpSplit types is that folks wouldn't be able to use this factory for any non-request-response app protocol that lives on top of UDP, which (severely, IMO) limits its applicability.

Worse, if we don't at least have separate UdpSend and UdpReceive traits (similarly to embedded-io-async Read and Write) and continue to live with a single send+receive *Udp trait, folks in my situation wouldn't be able to use even the UDP socket trait - without the UdpStack factory - in their apps. As in - I don't see what aspect of the UDP functionality of embedded-nal-async I would be able to actually use in my mDNS and Matter use cases then?

Why is implementing splittable sockets on top of RIOT so complicated? Even if RIOT ends up having a single callback for both "you can try sending now" and "got incoming packet" and thus with a single waker registration slot, you could still split the socket and it should work fine for intra-task concurrency (i.e. where you use the send and receive halves within a single task) - without any increased CPU consumption whatsoever.

Unified types IMO would require per-type send and receive AT from <#92> -- so lets do that :-)

If I understand you correctly, you are suggesting - additionally to the changes suggested here and including the change where we end up with a single Udp(Send)/(Receive) (pair) of traits - to also enhance these traits with its own SocketAddr associated type?

If yes, then I can do that, but even before that, if you could comment on how do you feel of having a single Udp (or as per above and better yet - a single UdpReceive / UdpSend pair) for all of connected / multiply-bound / single-bound use cases even though local, remote or both socket addrs might be rendundant in 2 out of the 3 use cases?

chrysn commented 7 months ago

The problem with not enhancing UdpStack to return UdpSplit types is that ...

I'm not fundamentally opposed to this split -- I'll merely need to look at good examples of how this would be implemented in stacks such as RIOT, and I'll want to understand better how it applies to protocols such as mDNS which have request-response components but also independent transmissions. (Like, will applications require that the sender is Clone as well?).

Therefore, I'd propose to treat the concerns separately, as the lifetime change should be something that can be approved fast (I don't have the position to do it but can review).

If I understand you correctly, you are suggesting - additionally to the changes suggested here and including the change where we end up with a single Udp(Send)/(Receive) (pair) of traits - to also enhance these traits with its own SocketAddr associated type?

Rougly so, yes. Don't have a full plan yet, but the addresses would be TryFrom and Into core::net::SocketAddress (stable in 1.77, conveniently resolving an issue that so far blocked #92). They could be lifetime generic, and thus allow addresses just be newtypes around &Socket for connected sockets. Given that the API would look like UnconnectedSocket interfaces, there might even be a single AT for the (local, remote) tuple.

If yes, then I can do that, but even before that, if you could comment on how do you feel of having a single Udp (or as per above and better yet - a single UdpReceive / UdpSend pair) for all of connected / multiply-bound / single-bound use cases even though local, remote or both socket addrs might be rendundant in 2 out of the 3 use cases?

Yes, let's do that. Once we have ATs per socket, the reason for having Connected (and the half other type) goes away, as that was about not always passing around large addresses -- but for a connected type we wouldn't need to if its AT is as small as it can be.

chrysn commented 6 months ago

The Send/Receive split does sound like a good direction to go in, I've recently been shown another example of where it is convenient (although might really be the same example; it's rs-matter's mDNS server).

A colleague is working on imlementing split sockets on top of the current embedded-nal sockets. I don't think it'll be super efficient, as there may be a few synchronization primitives involved, but if such an implementation can be made, if any OS does not allow selecting for read and write separately (or placing separate callbacks for read and write ready), it can still use such a solution.

ivmarkov commented 6 months ago

A colleague is working on imlementing split sockets on top of the current embedded-nal sockets. I don't think it'll be super efficient, as there may be a few synchronization primitives involved, but if such an implementation can be made, if any OS does not allow selecting for read and write separately (or placing separate callbacks for read and write ready), it can still use such a solution.

Something like this?

Unfortunately only possible for UDP. I don't think that approach would work for streaming protocols like TCP. Not with the current async fn specification of the UDP send/receive and embedded_io_async read/write methods that is.

Also, the price is less convenience and one extra buffer.

A generic split implementation that does not need extra buffers and works for TCP as well is only possible if the Read/Write traits are defined in terms of the old poll based API, like the STD async read/write traits in futures, tokio and so on. IMO. But if you manage to pull it through somehow - I'm all ears. :)

ivmarkov commented 6 months ago

The Send/Receive split does sound like a good direction to go in, I've recently been shown another example of where it is convenient (although might really be the same example; it's rs-matter's mDNS server).

The mDNS responder in rs-matter and edge-net are the same, yes, modulo some insignificant differences. Both were created by me and @bjoernQ

chrysn commented 6 months ago

Something like this?

The splitter would not even need to store a packet. It'd suffice if there were a channel by which the recver and the sender can communicate and where they can share the underlying exclusive reference. The sender would attempt to take the shared reference, and on failure signals the recver to please relinquish control. The receiver would do recv racing against reception through the shared channel, and if the channel wins, cancell the receive, return the exclusive reference so that the sender can take it, and retry after the sender is done. Of course that only works for cancellable futures, but AIU it's the understanding that that is the case here. (May want to work on that in parallel).

Let's see how the implementation works out. As this is at least all more complex than the lifetime introduction, would you mind splitting that out, so that it can progress further?

[same] mDNS responder

I'm still having a hard time wrapping my head around why that particular application works better with the split. Granted, my CoAP implementation is more request-response style, but it implements both a server and a client at the same time, and while it is not implemented here, it could easily be extended to allow requests with no processed response (it's in the protocol, just not the implementation). Gotta have a closer look at the mDNS responder…

rust-embedded-community / embedded-nal

UDP stack compatible with embassy-net; socket splitting #106