Open bastl98 opened 1 week ago
Hello, thanks for using RabbitMQ and this library.
So we don't have to guess, could you please provide a git repository with code to reproduce this issue? If one of the example projects will work, please let us know.
Have you tried this example code? https://github.com/rabbitmq/rabbitmq-stream-dotnet-client/tree/main/docs/ReliableClient
I have adapted the BestPracticesClient for reproduction purposes (logging in the consumer callback and a lowered heartbeat timespan). I have attached the BestPracticesClient and the appsettings.json that I used to reproduce the error.
Here's the link to the repo: https://github.com/bastl98/rmq-bug-source.git
Steps:
@bastl98 Thank you for reporting the issue.
The library is working properly. Closing the client when it does not receive the "alive" signal from the server is precisely the purpose of the heartbeat.
The problem here is that the Consumer still reports IsOpen == true even though it is closed; it should be marked as closed.
The correct status should be IsOpen == false
*EDIT See: https://github.com/rabbitmq/rabbitmq-stream-dotnet-client/issues/393#issuecomment-2419890934
Will the consumer handling in this case be fixed in the foreseeable future?
What do you mean? In this case, the consumer will be closed.
*EDIT See: https://github.com/rabbitmq/rabbitmq-stream-dotnet-client/issues/393#issuecomment-2419890934
@bastl98 OK, you were right. A heartbeat timeout should be treated as an Unexpected close, so the client should try to reconnect.
@Gsantomaggio unfortunately, the error is not resolved. I think the problem now is in the Dispose method of the connection.
When the heartbeats are missed, the Close method of the client is called, and this method sets the close status of the connection to unexpected.
But in the Dispose method of the connection, which is always called at the end of the Close method, the close reason is set back to normal.
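To make the sequence above easier to follow, here is a minimal Python model of it. It is only a sketch and not the library's C# code: the names CloseReason, BuggyConnection, FixedConnection, should_reconnect and simulate_missed_heartbeats are all hypothetical, and the guard in FixedConnection.dispose is just one possible way to preserve the reason, not necessarily how the client should fix it.

```python
from enum import Enum


class CloseReason(Enum):
    # Hypothetical stand-ins for the "normal" and "unexpected" close reasons.
    NORMAL = "normal"
    UNEXPECTED = "unexpected"


class BuggyConnection:
    """Models the reported behaviour: dispose() always resets the reason."""

    def __init__(self):
        self.close_reason = None

    def close(self, reason):
        # close() records why the connection is going away ...
        self.close_reason = reason
        # ... and always ends by disposing the connection.
        self.dispose()

    def dispose(self):
        # Bug: overwrites whatever reason close() just recorded.
        self.close_reason = CloseReason.NORMAL


class FixedConnection(BuggyConnection):
    """One possible fix (an assumption): keep a reason that is already set."""

    def dispose(self):
        # Fall back to NORMAL only if nobody recorded a reason before.
        if self.close_reason is None:
            self.close_reason = CloseReason.NORMAL


def should_reconnect(connection):
    # A reconnect is only attempted after an unexpected close.
    return connection.close_reason is CloseReason.UNEXPECTED


def simulate_missed_heartbeats(connection):
    # Too many missed heartbeats lead to close() with an unexpected reason.
    connection.close(CloseReason.UNEXPECTED)
    return should_reconnect(connection)
```

In this model, simulate_missed_heartbeats(BuggyConnection()) ends with the reason reset to normal, so no reconnect is attempted, while FixedConnection keeps the unexpected reason and a reconnect would be triggered.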
Describe the bug
Currently, the stream connection is not resilient when too many heartbeats are missed and there is a timeout on the connection close. All publishes of this connection result in a client timeout error and all consumers of this connection stop working.
There is also no reconnection attempt, because this error is processed as a "normal" connection close. A manual reconnection attempt is also not possible, because all properties which could be used to check if there is a need for a reconnection attempt still indicate that the connection is available:
Reproduction steps
1. Start an RMQ cluster with 3 nodes (Docker).
2. Create a stream system and a consumer with the client lib that connects to one of the nodes (lower the heartbeat interval for testing).
3. Pause the node to which the consumer is connected.
4. Wait for 4 heartbeats.
Expected behavior
The timeout error should not result in a "normal" connection close; it should lead to a reconnection attempt by the library itself.
Additional context
Our RMQ Cluster has version 3.13.0 and our client lib has version 1.8.2