akkadotnet / akka.net

Canonical actor model implementation for .NET with local + distributed actors in C# and F#.
http://getakka.net

Connection to remote actor system didn't recover after restart #3256

Closed YaroslavBrichko closed 6 years ago

YaroslavBrichko commented 6 years ago

- Akka.NET Akka.Remote version 1.3.2
- .NET 4.5.2

To reproduce:
1. Load the remote actor system.
2. Load the client actor system.
3. Make sure there is a connection between the systems: send a message. The message is transferred successfully.
4. Stop the remote actor system.
5. Start the remote actor system.
6. Try to send a message to the remote actor system: the message gets marked as a dead letter on the server (remote) actor.

P.S. It doesn't happen on version 1.0.4.

My Terminated handler: Receive<Terminated>(term => { Context.System.Terminate(); });

Horusiath commented 6 years ago

How are you sending your messages? Who is the message initiator (client or server)? Which actor system are you restarting? Could you provide a client/server sample?

YaroslavBrichko commented 6 years ago

The client sends with ClientActor.Tell(new MyMessage()) and the server handles it with ServerActor.Receive<MyMessage>(m=>{}). This works fine. The server actor runs in a console application. I shut it down and then run it again. The client actor gets a 'Terminated' message. Then I call ClientActor.Tell(new MyMessage()) again, and on the server actor I see:

[akka://ImportTaskQueueListener/user/ImportTaskQueueController/Dispatcher/Queue] Message MyMessage from akka.tcp://ImportQueueActorSystemClient@localhost:55498/user/ImportQueueRequestController/SenderActor_5614 to akka://ImportTaskQueueListener/user/ImportTaskQueueController/Dispatcher/Queue was not delivered. 1 dead letters encountered.

YaroslavBrichko commented 6 years ago

I took the example from Example. It works fine in version 1.0.4. I updated it to the latest version, 1.3.2, and it doesn't work. After stopping and restarting the server process, the client has no association with the server actor system.

Danthar commented 6 years ago

@YaroslavBrichko unfortunately that link does not help us. Linking a paid Pluralsight course and expecting us to be able to get the example code from there is unrealistic. Please provide a gist or GitHub repo where we can view the code.

Until then, the best we can do is guess.

YaroslavBrichko commented 6 years ago

Here you are. Compile it with Akka + Akka.Remote 1.0.4. Start the server, then start the client. It works fine. Stop the server process and start it again. The client sends a message and the server receives it. Great.

Update Akka + Akka.Remote to the latest version, 1.3.2. Start the server, then start the client. It works fine. Stop the server process and run it again. Try to send a message from the client. You'll see that the connection wasn't recovered: all messages the client sent went to dead letters. Thanks.

Danthar commented 6 years ago

Thx, I'll take a look at this tomorrow.

Danthar commented 6 years ago

OK, so I reproduced this, and it's nothing all that surprising.

The difference in the remoting code between 1.0.4 and 1.3.2 is huge. You can't really compare the two. The difference between those two versions is a TON of hardening and bugfixes. That's not even mentioning the switch from Helios to DotNetty (although Helios was originally based on DotNetty).

What you have run into here is a bug in the 1.0.4 code. It was never supposed to work that way. Read up on associations: http://getakka.net/articles/remoting/index.html

When reconnecting to a new instance of a remote actor system, any old associations should be marked as invalid. The bug was that this wasn't happening in 1.0.4. It was fixed later on.

In your example, the reason it does not work (after updating) is that you reuse the same IActorRef, so you keep trying to use the old association. If you want to fix this, you can do 2 things:

YaroslavBrichko commented 6 years ago

Thanks a lot. I'm considering the first option. I think using the IActorRef directly is preferable, isn't it?

Danthar commented 6 years ago

If you are doing real production stuff, then yes. Monitoring your remote actor ref and retrying upon termination is the best way to go in terms of performance.
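
A minimal sketch of what monitoring and retrying could look like, not taken from the sample in this thread. It assumes the client is given the full remote path of the server actor; the `Reconnect` message, the 3-second retry interval, and the choice to drop messages while disconnected are all hypothetical. It uses the standard `Identify`/`ActorIdentity` handshake to obtain a fresh IActorRef and `Context.Watch` to learn when that ref has died.

```csharp
using System;
using Akka.Actor;

// Hypothetical message type, mirroring the MyMessage used in this thread.
public sealed class MyMessage { }

// Hypothetical internal signal: "look the server actor up again".
public sealed class Reconnect
{
    public static readonly Reconnect Instance = new Reconnect();
    private Reconnect() { }
}

public class ClientActor : ReceiveActor
{
    // Assumed retry interval; tune it for your environment.
    private static readonly TimeSpan RetryInterval = TimeSpan.FromSeconds(3);

    private readonly string _serverPath;
    private IActorRef _server; // null while no live server ref is known

    public ClientActor(string serverPath)
    {
        _serverPath = serverPath;

        Receive<Reconnect>(_ =>
        {
            if (_server != null) return; // already connected, ignore stale timers

            // Ask whatever currently lives at the path to identify itself.
            Context.ActorSelection(_serverPath).Tell(new Identify(_serverPath));

            // No reply arrives while the server is down, so keep retrying.
            Context.System.Scheduler.ScheduleTellOnce(
                RetryInterval, Self, Reconnect.Instance, Self);
        });

        Receive<ActorIdentity>(id =>
        {
            if (_server != null || id.Subject == null) return;
            _server = id.Subject;   // fresh ref bound to the new server instance
            Context.Watch(_server); // deliver Terminated when it goes away
        });

        Receive<Terminated>(t =>
        {
            if (!t.ActorRef.Equals(_server)) return;
            _server = null;         // drop the stale ref instead of reusing it
            Self.Tell(Reconnect.Instance);
        });

        Receive<MyMessage>(msg =>
        {
            // Simplification: messages sent while disconnected are dropped.
            if (_server != null) _server.Tell(msg);
        });
    }

    protected override void PreStart()
    {
        Self.Tell(Reconnect.Instance);
    }
}
```

The actor could be created with something like `system.ActorOf(Props.Create(() => new ClientActor(path)))`, where `path` is the server actor's `akka.tcp://...` address. Note that this handler re-resolves the ref on `Terminated` rather than shutting the client actor system down.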

YaroslavBrichko commented 6 years ago

Thank you very much. I'm closing the issue.