Closed roman-huliak closed 1 year ago
@roman-huliak Thank you for filing this, do you think there is a way to test this?
Sorry, i can't (havent time for this). Modification what i've made is working (for me) and i'm using it on many production deployments. so far so good.
Could somebody please give a bit more context what this change fixes? I don’t have the best understanding of the protocol, and the client event loop’s inner workings.
As I understand it, this tries to add heartbeat events to the communication while something time-consuming is going on (download of huge message from queue).
But doesn’t any traffic on the connection count as heartbeat events? See e.g. here: Heartbeat Frames, RabbitMQ
But doesn’t any traffic on the connection count as heartbeat events? See e.g. here: Heartbeat Frames, RabbitMQ
After sifting through the code a while this seems to me it makes the heartbeat more regular, correct @roman-huliak? And as per the RabbitMQ heartbeats documentation that is perfectly fine. I'm not too worried about making heartbeats regular as it's a tiny frame to toss over the wire.
Thanks. I agree, it won’t hurt to have this feature. We might be able to test this by reading heartbeat logs from a sufficiently verbose RabbitMQ. I’ll try it out when I find some time. For what it’s worth, I’ll also try to run an internal queuing test suite with the changes from this PR. Just to see what that does.
@mermshaus Let us know how that goes :+1: . I'l continue prepping breaking changes for 0.6.0 and won't merge those until we make sure this won't mess up in some unexpected ways no of us 3 has forseen.
This is a thumbs-up regarding our test suite. :+1:
The smallest heartbeat
setting I used was 0.025 seconds. Although that doesn’t really had much impact because one property of our tests is that nearly all time is spent in actual job execution. Our queued messages are rather small. So all our logs look like this:
(I was able to log the heartbeats using the heartbeat_callback
setting in Bunny.)
Job started
...some seconds pass...
Job finished
Heartbeat <-- one (zero?) or more
Job started
...some seconds pass...
Job finished
Heartbeat <-- one (zero?) or more
Heartbeat <--
Job started
...
Job execution (the consumer function) is blocking the main Bunny loop.
Nevertheless, I assume that during my tests there was enough activity on the queue to have at least some of the heartbeats take place during message retrieval. I think this assumption is supported by the parts of the log in which multiple heartbeats happen between two jobs. What else would the client do during this interval than retrieving another message?
I also can confirm that multiple heartbeats outside of the job execution happen within (or close to) the provided heartbeat
interval:
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775011.97914, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.004494, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.029864, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.055178, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.080635, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.105936, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.131322, ...
{"level":"INFO","message":"Heartbeat callback","context":[],"metadata":{"timestamp":1670775012.156669, ...
We have ~160 tests using Bunny/the message queue. Most of them spawn 10 clients (workers). Most of the tests queue just a couple of messages. Our biggest test appends 500+. We are using these qos settings:
$this->channel->qos(
0, // Prefetch size
1, // Prefetch count
);
Released as part of v0.5.3
HB is not called if there is active communication from queue