scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org
232 stars 21 forks source link

Make Actors restartable #3470

Closed scabug closed 13 years ago

scabug commented 14 years ago

Actor cannot be restarted anymore (after finish, exit, kill, crash, ...) since at least 2.8.0.RC2. This used to work with 2.7.7 and 2.8.0.Beta1.

I think this is a critical regression, since it makes actor supervision and restart pretty much impossible.

Demo (works in REPL or script mode):

import scala.actors._

val a = new Actor { var c = 0; def act() = { c += 1; println("A: started: " + c) } }
a.start()
Thread.sleep(1000)
a.start()
Thread.sleep(1000)
a.start()

Output with 2.7.7.final and 2.8.0.Beta1-prerelease:

A: started: 1
A: started: 2
A: started: 3

Output with 2.8.0.RC2 and 2.8.0.r22016 (current snapshot):

A: started: 1

System information:

$$ java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

$$ uname -a
Linux host 2.6.31-21-generic SI-59-Ubuntu SMP Wed Mar 24 07:28:27 UTC 2010 x86_64 GNU/Linux

$$ lsb_release -d -s
Ubuntu 9.10
scabug commented 14 years ago

Imported From: https://issues.scala-lang.org/browse/SI-3470?orig=1 Reporter: @oschulz

scabug commented 14 years ago

@paulp said: It took place between beta1 and rc1. Unfortunately that is a 3700 line diff for src/actors alone, so that's my whole contribution.

git diff remotes/tags/R_2_8_0_Beta1_prerelease..remotes/tags/R_2_8_0_RC1 -- src/actors/ |wc
    3682   16054  126170
scabug commented 14 years ago

@oschulz said: It happened in r21106:

trait Actor ... {
  ...
  override def start(): Actor = synchronized {
    ...

was changed to

trait Actor ... {
  ...
  override def start(): Actor = synchronized {
    if (_state == Actor.State.New) {
    ...

and similar changes happened in Reactor.

It's a huge commit. These particular change were done in the process of making start() idempotent, which is a good thing, of course. Still, actors should be restartable, so start should support the case _state == Actor.State.Terminated, too.

Question is, should the mesage queue be cleared on restart? How do Akka and Lift actors handle this on restart?

scabug commented 14 years ago

@phaller said: I don't see how start can be idempotent and support restarting a terminated actor at the same time. Restarting a terminated actor could be made the only exception to an otherwise idempotent start; however, I think that would make the semantics of start too complicated with little gain.

Also, the actor supervision and restart patterns known from Erlang are not affected by making start idempotent: "restarting" works by creating a new actor with the same behavior as the one that terminated. Note that Erlang actors cannot be restarted either.

In summary, making start idempotent was intentional. The goal has been to make the behavior of start robust (the behavior of start in 2.7.7 was a source of hard-to-debug errors, see http://article.gmane.org/gmane.comp.lang.scala.user/17995), and to simplify its semantics. Actually, the behavior of start in 2.7.x was never really specified; sure, you can test the implementation, but I do not consider the change to make start idempotent a regression. Therefore, I would suggest to close this ticket, unless there are compelling use cases that show that non-idempotent uses of start are very useful.

For your information, Jonas Bon�r recommended to make start idempotent, see http://article.gmane.org/gmane.comp.lang.scala.user/18002/. Therefore, I assume that's also the case in Akka, but I haven't verified. Lift actors do not have execution states like started and terminated, they are always passively responsive to messages, but do not have "active" behavior.

scabug commented 14 years ago

@oschulz said: I guess you're right, it's probably best if start stays fully idempotent, and does not restart a crashed Actor.

However (please correct me if I'm wrong here, I'm no Erlang expert) I think restart capability is still important in the Scala case. In Erlang, of course, you would "throw away" a crashed process and a supervisor would start a new one, registered under the same name (did I get this right?). But there is no registering of (non-remote) actors in Scala, so how would other actors find the replacement for the crashed one?

Restarting under the same process-id would not fit Erlang well, since Erlang is more "pure", no reassignment of variables, etc. But with Scala, I think, restarting would make a lot of sense, since the Actor reference would stay valid, and we don't need a registering mechanism. Not centrally registering actors has a lot of advantages, it makes unused Actors available for GC for example, and the VM will determine for us if an Actor is still in use (i.e. referenced) or not.

So using the Actor's reference as the primary "key" to locate and access it should fit Scala very well. As far as Jonas (Akka's) Actors are concerned, they are certainly restartable (they even provide pre-/post-restart callbacks), from what I've seen. I guess the new ActorID's in Akka stay valid after a restart, too, but I haven't verified this.

Or did I get this wrong, is there already a way to locate a replacement actor in Scala? In that case, we can close this. If not, I'd plead for adding a simple Actor.restart(), maybe throwing an exception if the Actor is not in terminated state to make things safe.

I'll turn this from a regression into an urgent feature request, since my app currently hangs after an actor crashes. :-)

scabug commented 14 years ago

@phaller said: Thanks for your comments. I understand that finding replacement actors is problematic without a name registration mechanism like in Erlang. You're right, there is no such mechanism for non-remote actors. I also agree that using references as the primary way to locate actors fits the model of scala.actors very well, for the reasons you mention.

Adding a restart method seems to be a reasonable way to address this.

scabug commented 14 years ago

@phaller said: (In r22029) Addresses see #3470 by adding a method Reactor.restart. Review by rompf.

scabug commented 14 years ago

@oschulz said: Thanks a lot, Philipp!

Uhm - I'm almost afraid to ask, 'cause I know you're all really busy - any chance this will make it into RC3? I'm getting greedy now, I know ... ;-)

scabug commented 14 years ago

@phaller said: Yes, the change set is included in RC3.

scabug commented 14 years ago

@oschulz said: This seems to be fixed. :-)