eclipse-iceoryx / iceoryx

Eclipse iceoryx™ - true zero-copy inter-process-communication
https://iceoryx.io
Apache License 2.0

Backpressure / blocking / queue full indication on publisher / subscriber #1945

Open niclar opened 1 year ago

niclar commented 1 year ago

Hi, what's the recommended way of detecting that a blocking (WAIT_FOR_CONSUMER/BLOCK_PRODUCER) publisher/subscriber pair is subject to backpressure / blocking (i.e. the subscription queue is full)? It would be nice to have a helper function for this and not have to resort to application-layer stopwatch timers.

Best,

elfenpiff commented 1 year ago

@niclar We do not yet have this feature on our roadmap. How would you like to use it? Do you already have some API in mind?

The current approach would be to use the non-blocking feature and then you can use the subscriber method hasMissedData(). If this is an error case for you, you could increase the queue size of the subscriber and emit a fatal error whenever you miss data.
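To illustrate the non-blocking approach described above, here is a minimal sketch of a queue that discards the oldest sample when full and remembers that it did so. This is a toy model, not iceoryx code: `ToySubscriberQueue`, `push` and `take` are hypothetical names, and the reset-on-read behavior of its `hasMissedData()` is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Toy stand-in for a non-blocking subscriber queue (hypothetical, not the iceoryx implementation).
class ToySubscriberQueue
{
  public:
    explicit ToySubscriberQueue(std::size_t capacity)
        : m_capacity(capacity)
    {
        assert(capacity > 0 && "the sketch assumes a non-empty queue");
    }

    // Non-blocking push: when the queue is full, the oldest sample is
    // discarded and the loss is remembered.
    void push(std::uint64_t sample)
    {
        if (m_queue.size() == m_capacity)
        {
            m_queue.pop_front();
            m_missedData = true;
        }
        m_queue.push_back(sample);
    }

    // Takes the oldest sample; returns false when the queue is empty.
    bool take(std::uint64_t& out)
    {
        if (m_queue.empty())
        {
            return false;
        }
        out = m_queue.front();
        m_queue.pop_front();
        return true;
    }

    // Reports whether samples were lost since the last call, then resets the flag
    // (assumed semantics for this sketch).
    bool hasMissedData()
    {
        const bool missed = m_missedData;
        m_missedData = false;
        return missed;
    }

  private:
    std::size_t m_capacity;
    std::deque<std::uint64_t> m_queue;
    bool m_missedData{false};
};
```

A consumer would call `hasMissedData()` after each `take()` and decide whether an overflow is an alert, a counter increment, or a fatal error.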

niclar commented 1 year ago

I guess just a function to return the current queue allocation/occupancy would suffice. That would also facilitate detection of queue build up, so that's nice. Is that something that falls into the current design ?

Idea being to monitor these queues, alert and maybe add mitigation for slow consumption when they drift from a somewhat steady state..
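A rough sketch of what such monitoring could look like on the application side, assuming some way to sample the queue's current size and capacity exists. `OccupancyMonitor`, its parameters and the smoothing scheme are all hypothetical choices for illustration, not part of iceoryx.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keeps a smoothed baseline of the queue fill ratio and
// flags samples that drift above it by more than a threshold.
class OccupancyMonitor
{
  public:
    // alpha: smoothing factor in (0, 1] for the baseline.
    // threshold: allowed excess of the fill ratio over the baseline.
    OccupancyMonitor(double alpha, double threshold)
        : m_alpha(alpha)
        , m_threshold(threshold)
    {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Feeds one occupancy sample; returns true when the fill ratio exceeds
    // baseline + threshold.
    bool update(std::uint64_t size, std::uint64_t capacity)
    {
        const double fill =
            capacity == 0 ? 0.0 : static_cast<double>(size) / static_cast<double>(capacity);
        if (!m_initialized)
        {
            // The first sample defines the initial steady state.
            m_baseline = fill;
            m_initialized = true;
            return false;
        }
        const bool drifting = fill > m_baseline + m_threshold;
        // Only non-alerting samples update the baseline, so a sustained
        // build-up keeps alerting instead of becoming the new normal.
        if (!drifting)
        {
            m_baseline = m_alpha * fill + (1.0 - m_alpha) * m_baseline;
        }
        return drifting;
    }

    double baseline() const
    {
        return m_baseline;
    }

  private:
    double m_alpha;
    double m_threshold;
    double m_baseline{0.0};
    bool m_initialized{false};
};
```

With `alpha = 0.1` and `threshold = 0.25`, a queue hovering around 2 of 16 slots would trigger an alert once it reaches 12 of 16, and keep alerting until it drains again.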

-On a somewhat related note, do you know if the blocking protocol adds a lot of latency (when there's no backpressure) ?

elfenpiff commented 1 year ago

"I guess just a function to return the current queue allocation/occupancy would suffice. That would also facilitate detection of queue build up, so that's nice. Is that something that falls into the current design ?"

In the end the old data is not returned back to the high level publisher but just recycled. But with the subscriber you can detect it.

"Idea being to monitor these queues, alert and maybe add mitigation for slow consumption when they drift from a somewhat steady state.."

Do I understand you correctly that you would like to detect this on the publisher side as well?

"-On a somewhat related note, do you know if the blocking protocol adds a lot of latency (when there's no backpressure) ?"

Without backpressure the blocking behavior should have no overhead at all.

niclar commented 1 year ago

"Do I understand you correctly that you would like to detect this on the publisher side as well?" It's sufficient to detect this on the subscriber side.

"But with the subscriber you can detect it." -Do you have any pointers on how to do that for the blocking version ?

elfenpiff commented 1 year ago

@niclar

"But with the subscriber you can detect it." -Do you have any pointers on how to do that for the blocking version ?

This is not detectable for the blocking version. If this is a fatal failure for you, you could instead use the non-blocking version, check on every receive whether there was an overflow, and terminate if so.

niclar commented 1 year ago

Yeah.. terminating in these circumstances is quite bad in our production context. With blocking we at least get an early indication upstream (or rather symptoms: publications start taking more time, or some disconnects) that throughput is hampered, and we can still make it to shore.

elfenpiff commented 1 year ago

@niclar In the end you require just some additional methods in the subscriber like

uint64_t bufferSize() const;
uint64_t bufferCapacity() const;

which return the total capacity (bufferCapacity) and the current fill level (bufferSize)?
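As a sketch of how these two accessors could behave, here is a toy fixed-capacity ring buffer that exposes them. Only the names `bufferSize()` and `bufferCapacity()` come from the proposal above; `ToyBoundedQueue`, `tryPush` and `tryPop` are hypothetical, and nothing here reflects how the iceoryx subscriber queue is actually implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy fixed-capacity queue (hypothetical) exposing the two proposed accessors.
class ToyBoundedQueue
{
  public:
    explicit ToyBoundedQueue(std::uint64_t capacity)
        : m_slots(capacity)
    {
        assert(capacity > 0 && "the sketch assumes a non-empty queue");
    }

    // Returns false when the queue is full; a blocking publisher would wait
    // at this point instead of returning.
    bool tryPush(std::uint64_t sample)
    {
        if (m_size == m_slots.size())
        {
            return false;
        }
        m_slots[(m_head + m_size) % m_slots.size()] = sample;
        ++m_size;
        return true;
    }

    // Removes the oldest sample; returns false when the queue is empty.
    bool tryPop(std::uint64_t& out)
    {
        if (m_size == 0)
        {
            return false;
        }
        out = m_slots[m_head];
        m_head = (m_head + 1) % m_slots.size();
        --m_size;
        return true;
    }

    // Current fill level.
    std::uint64_t bufferSize() const
    {
        return m_size;
    }

    // Total number of slots.
    std::uint64_t bufferCapacity() const
    {
        return m_slots.size();
    }

  private:
    std::vector<std::uint64_t> m_slots;
    std::uint64_t m_head{0};
    std::uint64_t m_size{0};
};
```

A monitoring thread could then poll `bufferSize()` against `bufferCapacity()` and raise an alert well before the queue is full and the publisher starts blocking.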

niclar commented 1 year ago

@elfenpiff, yes that will suffice.