Implement `SyncSender` and `AsyncSender`

leandro-lucarella-frequenz commented 1 year ago

What's needed?

The Sender class provides an async send() method, ~but most (if not all) our senders could currently send synchronously~ [^1]. Being async means that the control flow/CPU could be taken away from the caller, which is very inconvenient for cases where sync is guaranteed.

[^1]: Actually, with the removal of Bidirectional, now we only have 2 types of channels and 1 is sync (broadcast) and the other one is async (anycast). There is also an issue about having an option to make Broadcast senders block too.

Proposed solution

Split Sender into SyncSender and AsyncSender (probably make them protocols instead of classes). SyncSender would provide a sync send() and AsyncSender would provide a async asend().

Use cases

Data sourcing actor: https://github.com/frequenz-floss/frequenz-sdk-python/pull/225#discussion_r1123163181
The resampling actor (Resampler class) can also benefit from this, now the resampling functions need to be async because send() is async: https://github.com/frequenz-floss/frequenz-sdk-python/blob/ce6d19b869ed26c4df1e08a332f4e4fa4cbfc199/src/frequenz/sdk/timeseries/_resampling.py#L711-L716
There could be channels that only provide a sync or an async sender interface.
Broadcast channel could implement senders that support both operations, send() would drop like now and asend() would wait until the queue has room for the message.

Alternatives and workarounds

Currently the Data sourcing actor is just spawning tasks for sending and assuming they will eventually finish, and successfully, as there is no error handling for the spawned tasks and if they don't end, we'll be leaking dangling tasks forever.

Additional context

Even if we don't have AsyncSenders now, we will probably need them in the future, at least for this issue: #20
Experiment using sync send() in the Data sourcing actor:
- https://github.com/frequenz-floss/frequenz-sdk-python/pull/233
- https://github.com/frequenz-floss/frequenz-channels-python/pull/75

mathias-baumann-frequenz commented 1 year ago

Additional context: In the experimental branches, this change resulted in a speed up of the datasourcing benchmark from 30k/s to 70k/s samples

shsms commented 1 year ago

I think there are some issues with doing this:

We won't be able to pass around the same Sender everywhere. And independently developed libraries will have unnecessary compatibility issues because one expects sync and another one provides async.
we won't be able to support inter-process communication through tcp if we switch to sync. A lot of the async decisions were made to ensure that we'll have a seamless transition for actors, such that they would "just work" in multi-process contexts as soon as the sdk supports it.

Making it sync improves the performance in single-thread contexts, but it will limit our ability to scale beyond a single thread, in the future. So I'm against this change.

mathias-baumann-frequenz commented 1 year ago

I thought the idea was multi-process, not multi-thread? It's an important distinction here..

we won't be able to support inter-process communication through tcp if we switch to sync.

I doubt that statement to be honest. Right now, all send does is put the sample in a queue. If that queue is full, it drops the oldest sample. Networked or not, this never needs an async send.

There are various strategies to handle the network part, the most important aspect is whether we need reliability or not. If we do, it needs to be async of course, otherwise not.

I believe we have both requirements, some need reliability, some don't.

At the moment the data sourcing actor is simply infinitely creating tasks so this is basically just a rather inefficient queue with the only job to call one function which puts the sample.. in another queue.

leandro-lucarella-frequenz commented 1 year ago

we won't be able to support inter-process communication through tcp if we switch to sync. I doubt that statement to be honest. Right now, all send does is put the sample in a queue. If that queue is full, it drops the oldest sample.

These are both good points. If we have a send queue when we go networked, then we'll need a sending task that consumes from the queue, because right now is the receiver which consumes from the queue, but when there is a socket in the middle, that won't be the case.

But if we use a queue and we have (sync) send bursts, it will be very hard that the queue gets full and we lose messages, which is probably not what we want. Relegating the buffering to the socket might be a good approach.

But then, when we use TCP, then suddenly there is no more unreliable sending of messages, if we really want to provide some unreliable message sending approach that is more real-time, we should probably use UDP (and UDPs send_to() is sync).

Maybe the real discussion here, thinking about the long term (i.e. networked transport for channels), is if we want to really provide unreliable messages channels. We certainly will need reliable channels, to send power-setting commands for example. If we want unreliable channels, then I think having a sync send() still makes sense. The sync / async should be tied to that IMHO.

llucax commented 11 months ago

Actually, with the removal of Bidirectional, now we only have 2 types of channels and 1 is sync (Broadcast) and the other one is async (Anycast). There is also an issue about having an option to make Broadcast senders block too.

So I'm not sure if this is feasible or not, unless we have 2 different types of broadcast channels, one that blocks sends and one that doesn't, instead of making it just an option for the same channel.

Updating the issue description to mention this.

frequenz-floss / frequenz-channels-python