amqp / rhea

A reactive messaging library based on the AMQP protocol
Apache License 2.0

idle_time_out fires which event? #397

Closed iam-baju closed 1 year ago

iam-baju commented 1 year ago

Hello @grs, I have set idle_time_out in my connection options. What does this signify, and which connection event is fired once this limit is crossed?


Also, is there an upper limit to this idle_time_out and is there a way we can keep the AMQP connections alive forever? Thank you

grs commented 1 year ago

The idle timeout is a signal to the peer (i.e. the process on the other side of the socket) that you want them to send some traffic at least once in the specified time, to indicate that the connection is working. If they don't, the connection will be considered failed and closed explicitly.

There is no limit in AMQP or in rhea as to how long a connection can be kept open. However it also depends on the peer and on the various routers etc between you and the peer.
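As a rough illustration of the option being discussed, here is a minimal sketch of a connection options object carrying idle_time_out. The helper name, host and port are placeholders invented for the example, and the value is assumed to be in milliseconds (which is how I read the rhea README); treat it as a sketch rather than a recommended configuration.

```javascript
// Hypothetical helper (not part of rhea): builds the options object that
// would be handed to rhea's connect(). Host and port are placeholders.
function buildConnectionOptions(idleTimeoutMs) {
  return {
    host: 'broker.example.com', // placeholder, not from the thread
    port: 5672,                 // standard AMQP port; adjust for your broker
    // Advertised to the peer when the connection opens: "send me some
    // traffic at least this often, or I will treat the connection as failed".
    // Assumed to be milliseconds.
    idle_time_out: idleTimeoutMs,
  };
}

const options = buildConnectionOptions(60 * 1000);
console.log(options.idle_time_out);

// With rhea installed, the options would be used roughly like this:
//   const connection = require('rhea').connect(options);
```

Note that this value only tells the peer what you expect from it; it does not by itself make the client send anything.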

iam-baju commented 1 year ago

Currently, if I don't set any idle_time_out in the connection options, a disconnected event is fired every 5 minutes. Any particular reason for this?

We do reconnect afterwards, but is there a way to avoid this completely?

grs commented 1 year ago

> Currently, if I'm not setting any idle_time_out in the connection options - a disconnected event is fired every 5 mins. Any particular reason for this?

Probably a question for the broker/service you are connecting to. It sounds like they are closing the connection in that case.

> Although we're reconnecting to the connection after it, but is there a way we can completely avoid this?

What happens if you do set an idle timeout?

iam-baju commented 1 year ago

If we set an idle_time_out of less than 5 minutes, a connection_close event fires and the connection is closed (which I assume is the expected behaviour). However, for any idle_time_out above 5 minutes, our connection gets disconnected at exactly the 5-minute mark (we have a reconnection mechanism in place, but we'd like to avoid this scenario).

grs commented 1 year ago

What service/broker are you connecting to?

iam-baju commented 1 year ago

It's an internal service called Message-queuing (Message Broker as a Service @ SAP). It's based on Solace PubSub.

grs commented 1 year ago

I would ask the support for that service about it. They may be able to explain why you see what you see.

jmargieh commented 1 year ago

@grs, Is there an option to tell rhea to send an empty frame to the message broker? I think this is the missing part..

This library is provided by SAP: https://www.npmjs.com/package/@sap/xb-msg-amqp-v100?activeTab=readme#idle-timeout

idleTimeoutTryKeepAlive: defines the timeout behavior, indicates whether to send an empty frame to keep the connection alive or to end the connection, sending a close frame with an appropriate error message.

grs commented 1 year ago

> Is there an option to tell rhea to send an empty frame to the message broker?

rhea will do that automatically if the broker requests it; you can verify that with a protocol trace (e.g. wireshark or increasing logging on one side or the other)

iam-baju commented 1 year ago

> rhea will do that automatically if the broker requests it; you can verify that with a protocol trace (e.g. wireshark or increasing logging on one side or the other)

@grs, what do you mean by "if the broker requests it"? Can't we send the empty frame from our end and keep the connection alive? Something like this: [@sap/xb-msg-amqp-v100](https://www.npmjs.com/package/@sap/xb-msg-amqp-v100#:~:text=to%20net.setTimeout()-,idleTimeoutTryKeepAlive,-%3A%20defines%20the%20timeout)

grs commented 1 year ago

The way AMQP works is that each side of the connection can separately inform the other of the interval after which, if they receive nothing, they will close the connection. So if the broker is going to close the connection after x secs, it should indicate that to the client, and the client will ensure it sends heartbeats more frequently than that. I.e. you don't configure the client to send heartbeats, the broker does.

(Likewise, you can configure the interval after which the client will close the connection if it has not received anything. This will be sent to the broker and the broker should - if it wants to keep the connection open - send heartbeats to prevent the connection being idle for that configured period, or else the client would close the connection.)
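The two independent directions described above can be sketched as a small model. This is an illustration only, not rhea's implementation: the function name is made up, and sending at half the peer's advertised interval is an assumption chosen to satisfy "more frequently than that".

```javascript
// Illustrative model of AMQP idle-timeout negotiation (not rhea's code).
// localIdleTimeoutMs:  what WE advertise to the peer (e.g. idle_time_out).
// remoteIdleTimeoutMs: what the PEER advertised to us, if anything.
function planKeepAlive(localIdleTimeoutMs, remoteIdleTimeoutMs) {
  return {
    // Our heartbeats are driven by the peer's advertised value. Half the
    // interval is an assumed safety margin. If the peer advertised nothing,
    // we have no reason to send heartbeats at all.
    sendHeartbeatEveryMs:
      remoteIdleTimeoutMs > 0 ? Math.floor(remoteIdleTimeoutMs / 2) : null,
    // Our own timeout governs when WE give up on a silent peer.
    closeIfSilentForMs: localIdleTimeoutMs > 0 ? localIdleTimeoutMs : null,
  };
}

// Broker advertises 5 minutes, client advertises 1 minute.
console.log(planKeepAlive(60000, 300000));
// Broker advertises nothing: the client sends no heartbeats in this model.
console.log(planKeepAlive(60000, undefined));
```

In this model, a broker that enforces a timeout without advertising it would never receive heartbeats from the client, which is one reason a protocol trace of the broker's open frame is worth checking.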

jmargieh commented 1 year ago

@grs, when setting idle_time_out we get a closed event followed by a disconnected event. A disconnected event may occur for many other reasons. We are building a library on top of rhea; we listen to the disconnected event and act accordingly. We haven't found a way to discard disconnected events that are due to the idle timeout. Do you have any suggestions?

grs commented 1 year ago

> when setting idle_time_out we get closed event followed by disconnected event

That sounds like your broker is failing to send heartbeats.

> disconnected event may occur for many other reasons. We are building a library on top of rhea, we listen to the disconnected event and act accordingly. We haven't found a way to discard disconnected events due to idle timeout. do you have any suggestions?

If the broker does not send heartbeats, rhea will send a connection-close to the broker with a condition of amqp:resource-limit-exceeded and a description 'max idle time exceeded'. If the connection is in fact dead, the broker likely would not receive that. However, if you are getting a connection_close event, it means the broker got the close and sent its own close back. What the condition or description is when it does so will depend on the broker. You can see what it sends, and that might be of use to you (but it would be broker-dependent behaviour).

The disconnect event is raised when the socket itself is closed. There is at present no way to distinguish different reasons for that close.
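Since the socket-level event carries no reason, one possible workaround is to remember whether a connection_close was seen just before the disconnected event. The sketch below is a heuristic, not a rhea feature: the helper name and the shape of the info object are invented for illustration, and a preceding close does not prove that the idle timeout was the cause. A plain EventEmitter stands in for a rhea connection so the example runs on its own.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: `connection` is anything that emits rhea-style
// 'connection_open', 'connection_close' and 'disconnected' events.
function watchDisconnects(connection, onDisconnect) {
  let closeSeen = false;
  connection.on('connection_open', () => { closeSeen = false; });
  connection.on('connection_close', () => { closeSeen = true; });
  connection.on('disconnected', (context) => {
    onDisconnect({
      // true when an AMQP-level close arrived before the socket went away
      afterClose: closeSeen,
      // socket-level error code, if the context carries one
      errorCode: context && context.error ? context.error.code : undefined,
    });
    closeSeen = false; // reset for the next connection attempt
  });
}

// Stand-in for a rhea connection, used only to exercise the helper.
const fake = new EventEmitter();
const seen = [];
watchDisconnects(fake, (info) => seen.push(info));

fake.emit('connection_close', {});
fake.emit('disconnected', { error: { code: 'ERR_STREAM_WRITE_AFTER_END' } });
fake.emit('disconnected', {}); // socket dropped with no preceding close
console.log(seen);
```

A library built on top of rhea could then treat disconnects with afterClose set differently from abrupt socket failures, bearing in mind that what the broker puts in its close is broker-dependent.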

jmargieh commented 1 year ago

Thanks for the explanation. On the close event we do not get anything on the context; however, on the disconnected event we get context.error:

{"code":"ERR_STREAM_WRITE_AFTER_END"}

What you're saying is that we are getting this from the broker?

grs commented 1 year ago

No, the disconnected event is a result of the socket closing. The closed event is, I believe, a result of the broker sending a close.
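For context, ERR_STREAM_WRITE_AFTER_END is a Node.js stream error code, raised locally when something writes to a stream after end() has been called on it, so it does not come over the wire from the broker. The sketch below reproduces the code with a plain Writable; whether rhea reaches it through exactly this sequence is an assumption, and the function name is made up for the example.

```javascript
const { Writable } = require('stream');

// Reproduces the error code locally, with no network or broker involved.
function writeAfterEnd() {
  return new Promise((resolve) => {
    // A sink that accepts and discards everything written to it.
    const sink = new Writable({
      write(chunk, encoding, callback) { callback(); },
    });
    sink.on('error', () => {}); // keep the 'error' event from being thrown
    sink.end();                 // like a socket whose writable side has ended
    // Writing after end() fails; the callback receives the error.
    sink.write('late data', (err) => resolve(err && err.code));
  });
}

const result = writeAfterEnd();
result.then((code) => console.log(code));
```

Run under Node.js, this is expected to log ERR_STREAM_WRITE_AFTER_END, which is consistent with the error describing the local socket's state rather than anything the broker sent.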