smol-rs / async-broadcast

Async broadcast channels
Apache License 2.0
163 stars 26 forks source link

Add CapacityLimiter trait #23

Closed danieldg closed 2 years ago

danieldg commented 2 years ago

This is a rework of #21 that doesn't split the implementation and allows more use cases.

There are some minor API changes in error handling, but otherwise default generic types allow the API to be identical if you do not use limiters. It also fixes an infinite loop in send if you have called set_capacity(0).

danieldg commented 2 years ago

Overall looks good to me. I wonder though if we can minimize the breakage and simplify further by leaving the interpretation of "empty" and "full" to CapacityLimiter implementers. Then we don't need specific errors inside errors.

Full has been wired up, but I'm not sure what you mean by leaving "empty" up to the limiter - an empty queue is empty regardless of what the CapacityLimiter has to say about it. Maybe the ByteCapacityError makes this clearer?

zeenix commented 2 years ago

Overall looks good to me. I wonder though if we can minimize the breakage and simplify further by leaving the interpretation of "empty" and "full" to CapacityLimiter implementers. Then we don't need specific errors inside errors.

Full has been wired up, but I'm not sure what you mean by leaving "empty" up to the limiter

I actually didn't have anything specific in mind about empty part. I was just thinking in general terms.

zeenix commented 2 years ago

Overall looks good to me. I wonder though if we can minimize the breakage and simplify further by leaving the interpretation of "empty" and "full" to CapacityLimiter implementers. Then we don't need specific errors inside errors.

Full has been wired up,

Has it? Perhaps I didn't explain well since you still have FullError in SendError. If interpretation of "full" is limiter-specific, why do we need another error type for it? SendError::Full just means channel is full and user doesn't need to care about more details than that?

danieldg commented 2 years ago

Overall looks good to me. I wonder though if we can minimize the breakage and simplify further by leaving the interpretation of "empty" and "full" to CapacityLimiter implementers. Then we don't need specific errors inside errors.

Full has been wired up,

Has it? Perhaps I didn't explain well since you still have FullError in SendError. If interpretation of "full" is limiter-specific, why do we need another error type for it? SendError::Full just means channel is full and user doesn't need to care about more details than that?

The example in SimpleMemoryCapacityLimiter is perhaps better to help understand why this is useful: the error can include information about why the channel was still full even if the sender was willing to block until there was room.

zeenix commented 2 years ago

Overall looks good to me. I wonder though if we can minimize the breakage and simplify further by leaving the interpretation of "empty" and "full" to CapacityLimiter implementers. Then we don't need specific errors inside errors.

Full has been wired up,

Has it? Perhaps I didn't explain well since you still have FullError in SendError. If interpretation of "full" is limiter-specific, why do we need another error type for it? SendError::Full just means channel is full and user doesn't need to care about more details than that?

The example in SimpleMemoryCapacityLimiter is perhaps better to help understand why this is useful: the error can include information about why the channel was still full even if the sender was willing to block until there was room.

I actually meant TrySendError::Full, that example code doesn't demonstrate the need for adding details to TrySendError::Full, does it?

danieldg commented 2 years ago

I actually meant TrySendError::Full, that example code doesn't demonstrate the need for adding details to TrySendError::Full, does it?

It doesn't do so explicitly, but you could inspect the error in the first failing try_broadcast call to produce an estimate of how long it will take to succeed (say this is a non-overflowing channel where it's known that one of the receivers is connected to a network link with relatively steady bandwidth).

zeenix commented 2 years ago

I actually meant TrySendError::Full, that example code doesn't demonstrate the need for adding details to TrySendError::Full, does it?

It doesn't do so explicitly, but you could inspect the error in the first failing try_broadcast call to produce an estimate of how long it will take to succeed (say this is a non-overflowing channel where it's known that one of the receivers is connected to a network link with relatively steady bandwidth).

That sound extremely specific to me and I'm not convinced it's something we need to cater for in async-broadcast.

danieldg commented 2 years ago

That sound extremely specific to me and I'm not convinced it's something we need to cater for in async-broadcast.

Then just ignore the error in the Full case. The error in the Rejected case is still useful, and if it's there then there is no additional cost to having it around in the Full variant.

zeenix commented 2 years ago

That sound extremely specific to me and I'm not convinced it's something we need to cater for in async-broadcast.

Then just ignore the error in the Full case.

Huh? We shouldn't add complexity to the API w/o valid justification.

The error in the Rejected case is still useful, and if it's there then there is no additional cost to having it around in the Full variant.

They are different cause the main point of Rejected for me is to ensure that user doesn't end up waiting forever to send a message. In case of Full, all user cares about if they want to wait (in which case they use send.await) or not (in which case they'll use try_send directly). The details of the Full aren't important IMO.

danieldg commented 2 years ago

They are different cause the main point of Rejected for me is to ensure that user doesn't end up waiting forever to send a message. In case of Full, all user cares about if they want to wait (in which case they use send.await) or not (in which case they'll use try_send directly). The details of the Full aren't important IMO.

We agree about Rejected and in the general use of Full, but I don't see how having the error available to Full complicates the API. The type is not more complex due to the error (it already needed it for Rejected), the value is still available via into_inner or by direct matching. Maybe it takes a few more characters in a match statement? Not sure it's quite that useless.

zeenix commented 2 years ago

I don't see how having the error available to Full complicates the API.

Would you still need the is_fatal method if the Rejected is always fatal and Full is never fatal?

danieldg commented 2 years ago

I don't see how having the error available to Full complicates the API.

Would you still need the is_fatal method if the Rejected is always fatal and Full is never fatal?

Yes, because the CapacityLimiter does not return TrySendError, it returns Self::Error.

zeenix commented 2 years ago

CapacityLimiter appears to exist so end-users can declare budgets in ways other than element count. But doing so today is no more than a one-liner:

Those lines assume the simplest case that will not be satisfactory for most use cases here: statically sized types. I'd actually argue that most messages will be dynamically sized (e.g in zbus, they come from another process and can container arbitrary data so can't possibly be statically-sized). Moreover, message-size based capacity limiter is only 1 of the possible limiters people would need but since it's a very typical one people would need, as implementation is being provided.

I understand that z-bus might need to be able to declare capacity in different ways, but the abstraction for that should then live at the z-bus layer (or some other, separate crate), but not here.

But how? AFAIK currently there is no way to implement different capacity limiters. If we could add the generic parts (CapacityLimiter trait and API around that) then users can do that for sure.

P.S. it's just zbus, w/o a -. :)

danieldg commented 2 years ago

CapacityLimiter appears to exist so end-users can declare budgets in ways other than element count. But doing so today is no more than a one-liner:

That only works if all elements are the same size. If you have a channel that contains things like Vec or Box<dyn Trait> then it makes sense to have a channel store "10 large ones or 100 small ones".

zeenix commented 2 years ago

I don't see how having the error available to Full complicates the API.

Would you still need the is_fatal method if the Rejected is always fatal and Full is never fatal?

Yes, because the CapacityLimiter does not return TrySendError, it returns Self::Error.

But what if it did? Once thing I do agree with @yoshuawuyts here is that this is likely adding more API than needed.

danieldg commented 2 years ago

Yes, because the CapacityLimiter does not return TrySendError, it returns Self::Error.

But what if it did? Once thing I do agree with @yoshuawuyts here is that this is likely adding more API than needed.

We could add a new enum for this (reusing TrySendError as-is wouldn't be pretty - you would need to request something like TrySendError<(), Self::Error> and make it an error to return Closed or Inactive from try_add) but that seems to be an increase in complexity.

yoshuawuyts commented 2 years ago

I think we need to tease apart two ideas here:

  1. How do we compute the size of an item?
  2. Does that computation need to live within async-broadcast?

While interesting, I don't think the exact details of how we compute sizes are important for this PR. Instead all we need to decide on is whether this logic should live inside async-broadcast or not. If in my example above we would replace the implementation of how_many with one that works in some way we deem correct, it doesn't fundamentally change anything about how async-broadcast should be used. This leads me to believe that that logic in fact should not exist within async-broadcast.

Perhaps my perspective on this is wrong though. But in order to accept this PR we should produce a case where we can clearly identify that "computing the size of items" requires existing within async-broadcast's API, and can in fact not be separate.

danieldg commented 2 years ago

I think we need to tease apart two ideas here:

1. How do we compute the size of an item?

2. Does that computation need to live within `async-broadcast`?

While interesting, I don't think the exact details of how we compute sizes are important for this PR. Instead all we need to decide on is whether this logic should live inside async-broadcast or not. If in my example above we would replace the implementation of how_many with one that works in some way we deem correct, it doesn't fundamentally change anything about how async-broadcast should be used. This leads me to believe that that logic in fact should not exist within async-broadcast.

The answer to both of these is "the CapacityLimiter trait". The computation does not live within async-broadcast except for the simplest case.

Perhaps my perspective on this is wrong though. But in order to accept this PR we should produce a case where we can clearly identify that "computing the size of items" requires existing within async-broadcast's API, and can in fact not be separate.

The SimpleMemoryCapacityLimiter struct does not technically need to live in this crate, but it's a useful example for users to refer to when creating their own.

zeenix commented 2 years ago

The answer to both of these is "the CapacityLimiter trait". The computation does not live within async-broadcast except for the simplest case.

How about this:

That way we keep the API introduced in the async-broadcast to minimum while still supporting plugable capacity limiters.

danieldg commented 2 years ago

Or just move the examples (because that's what they really are) to a sub-module so they don't clutter up the main struct list? It's not like they are very large or contribute much to compile time / binary size (if not used).

zeenix commented 2 years ago

It's not like they are very large or contribute much to compile time / binary size (if not used).

The issue is not compile-time but rather API surface. Also the same use case applies to async-channel since it also has capacity limiting and best way to share between them would be an external crate.

zeenix commented 2 years ago

@danieldg do you think you'll be working on this? If not, I'll close it.