Which Akka.NET Modules?
Cluster, Cluster Sharding, Distributed Data, Persistence, Streams
Describe the bug
Note that this might not be a bug but just an incorrect use of Akka on our end.
A node of ours got quarantined after seemingly losing an association to itself.
From the log of that node (which was using port 49794):
[akka://my-cluster/system/endpointManager] Association to [akka.tcp://my-cluster@mycluster.corporate.net:49794] with UID [1642503559] is irrecoverably failed. Quarantining address.
System.TimeoutException: Delivery of system messages timed out and they were dropped
[akka://my-cluster/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fmy-cluster%40corporate.net%3A49794-10] Removing receive buffers for [akka.tcp://my-cluster@mycluster.corporate.net:49794]->[akka.tcp://my-cluster@mycluster.corporate.net:49794]
It doesn't look like the node was especially busy before that (but there is that TimeoutException) and other nodes don't seem to have had issues with that node before either.
To Reproduce
Unfortunately, we currently have no way to reproduce this issue.
Expected behavior
The association failure should not happen.
Actual behavior
The association failure occurred.
Screenshots
n/a
Environment
Windows, .NET 6.0.9
Additional context
As requested by @Aaronontheweb on Discord, the HOCON in effect has been attached:
if you tried to send a remote ActorSelection to yourself what happens? I don't know if we have a test case for that
that's easy to reproduce at least
if you wouldn't mind filing an issue and showing us a sanitized HOCON configuration
that would be very helpful
and we can look at reproducing the error as well
This is the HOCON of the failing node only, please let me know if you need the others as well.
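The experiment suggested in the quoted messages (sending a remote ActorSelection to the node's own address) could be sketched roughly as below. This is only a minimal sketch under assumptions: the system name `repro`, the `EchoActor` class, the loopback hostname, and the 5-second timeout are all hypothetical and not taken from our application or its HOCON, and a plain `remote` provider is used instead of a cluster. It has not been verified to trigger the quarantine described above.

```csharp
using System;
using Akka.Actor;
using Akka.Configuration;

// Hypothetical minimal remoting config; NOT the HOCON of the failing node.
var config = ConfigurationFactory.ParseString(@"
akka.actor.provider = remote
akka.remote.dot-netty.tcp {
  hostname = ""127.0.0.1""
  port = 0   # let the OS pick a free port
}");

var system = ActorSystem.Create("repro", config);
system.ActorOf(Props.Create(() => new EchoActor()), "echo");

// Build a full remote path (akka.tcp://repro@127.0.0.1:<port>/user/echo)
// that points back at this very ActorSystem.
var selfAddress = ((ExtendedActorSystem)system).Provider.DefaultAddress;
var selection = system.ActorSelection($"{selfAddress}/user/echo");

try
{
    // Send through the remote-style selection and wait for the echo.
    var reply = await selection.Ask<string>("ping", TimeSpan.FromSeconds(5));
    Console.WriteLine($"Reply received: {reply}");
}
catch (Exception ex)
{
    // Any timeout or association error would surface here or in the Akka log.
    Console.WriteLine($"Self-selection failed: {ex.GetType().Name}: {ex.Message}");
}

await system.Terminate();

// Hypothetical actor that replies with whatever it receives.
class EchoActor : ReceiveActor
{
    public EchoActor()
    {
        ReceiveAny(msg => Sender.Tell(msg));
    }
}
```

While this runs, the log can be checked for association or quarantine messages like the ones quoted from the failing node.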
Version Information
Version of Akka.NET? 1.4.28