james-cobb opened this issue 6 years ago
I think this may be caused when an exception is thrown in the started() method: restarting() is then called on an EndpointWriter that was never correctly set up. The exception in started() could itself be caused by sending a message to a PID with an empty address, and I think this could then cause the infinite loop observed.
I have confirmed that an empty PID was causing this issue.
The question remains what the correct behaviour should be when a message is sent to a non-existent remote PID. An EndpointWriter will be created that currently throws an exception in started(), causing an infinite retry loop.
I think the answer might be a supervision strategy that only allows for a limited number of Restarts. Is that implemented anywhere else?
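As a rough illustration (hypothetical code, not an existing protoactor-kotlin API), such a strategy could count restarts inside a sliding time window and stop the child once the limit is exceeded:

```kotlin
// Hypothetical restart limiter, not an existing protoactor-kotlin API: decide whether a
// failing child should be restarted again or stopped, based on how many restarts
// happened inside a sliding time window.
enum class SupervisorDirective { RestartChild, StopChild }

class LimitedRestartStrategy(
    private val maxRestarts: Int = 3,
    private val windowMillis: Long = 10_000
) {
    private val restartTimes = ArrayDeque<Long>()

    fun decide(now: Long = System.currentTimeMillis()): SupervisorDirective {
        // Forget restarts that fell outside the window.
        while (restartTimes.isNotEmpty() && now - restartTimes.first() > windowMillis) {
            restartTimes.removeFirst()
        }
        restartTimes.addLast(now)
        // Too many restarts in the window: give up instead of looping forever.
        return if (restartTimes.size > maxRestarts) SupervisorDirective.StopChild
               else SupervisorDirective.RestartChild
    }
}
```

In this sketch, each Restarting of the EndpointWriter would go through decide(); returning StopChild breaks the loop instead of restarting indefinitely.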
Simple demonstration of the retry loop:
Remote.start("localhost" , 1234)
send(PID.newBuilder().build(), "")
readLine()
In tests that deliberately cause gRPC endpoints to fail, we found that the EndpointWriter on the surviving node can get into an infinite loop.
The EndpointWriter is sent a Restarting message. In the restarting() method, channel.shutdownNow() is called, but the lateinit channel has not yet been initialized. This throws an exception in restarting(), which makes the supervisor force another restart, producing a loop that uses 100% CPU.
https://github.com/AsynkronIT/protoactor-kotlin/blob/2234df6fdc5cf4175624ec3f1632de72f718bcc0/proto-remote/src/main/kotlin/actor/proto/remote/EndpointWriter.kt#L67
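A minimal sketch of a possible guard (this is not the actual EndpointWriter code): use Kotlin's ::channel.isInitialized check so that restarting() cannot throw when started() failed before the channel was assigned.

```kotlin
import io.grpc.ManagedChannel

// Minimal sketch only, not the real EndpointWriter: guard the lateinit channel so that
// restarting() cannot throw an UninitializedPropertyAccessException when started()
// failed before the channel was assigned.
class GuardedWriterSketch {
    private lateinit var channel: ManagedChannel

    fun restarting() {
        // Only shut the channel down if started() actually initialized it.
        if (::channel.isInitialized) {
            channel.shutdownNow()
        }
    }
}
```

On its own this would not fix the underlying retry for a bad address, but it would stop restarting() itself from failing and feeding the loop.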