Call! consumes others' exit messages

DimaSalakhov commented 8 years ago

If actor1 is call!-ing a Server and a watched or linked actor2 exits, then bad things happen.

Scenario1: Actor1 watches Actor2 and traps lifetime events. Actor1 is calling Server1 and before Server1 responds, Actor2 exits. In that case Actor1 will never learn about Actor2's death: the exit message is thrown away inside call's SelectiveReceiveHelper.

Scenario2: Actor1 links Actor2 and traps lifetime events. Actor1 is calling Server1 and before Server1 responds, Actor2 exits. In that case an unhandled exception will leak outside and kill all humans. Here is a stack trace:

Exception in Fiber "fiber-10000007" co.paralleluniverse.actors.LifecycleException: ExitMessage{actor: ActorRef@5e007c49{PulsarActor@6e191ac8a713[owner: fiber-10000009]}, cause: null}
    at co.paralleluniverse.actors.Actor.handleLifecycleMessage(Actor.java:755)
    at co.paralleluniverse.actors.PulsarActor.handleLifecycleMessage(PulsarActor.java:121)
    at co.paralleluniverse.actors.SelectiveReceiveHelper.handleLifecycleMessage(SelectiveReceiveHelper.java:298)
    at co.paralleluniverse.actors.behaviors.RequestReplyHelper$1.handleLifecycleMessage(RequestReplyHelper.java:169)
    at co.paralleluniverse.actors.SelectiveReceiveHelper.receive(SelectiveReceiveHelper.java:121)
    at co.paralleluniverse.actors.behaviors.RequestReplyHelper.call(RequestReplyHelper.java:174)
    at co.paralleluniverse.actors.behaviors.Server.call(Server.java:102)
    at co.paralleluniverse.actors.behaviors.Server.call(Server.java:80)
    at co.paralleluniverse.pulsar.actors$call_BANG_.invoke(actors.clj:669)

Here is a small working example:

(ns failzzz
  (:require [co.paralleluniverse.pulsar.actors :refer :all]
            [clojure.core.match :refer [match]])
  (:import (co.paralleluniverse.strands Strand)))

(defn dodgy-server []
  (reify Server
    (init [this])
    (handle-call [this from id message]
      (match message
             [:ping actor2] (do (! actor2 [:headshot])
                                (Strand/sleep 5000)
                                "pong")))
    (terminate [this cause]
      (println "Terminating server"))))

(defn spawn-actor1 []
  (let [srv (spawn (gen-server (dodgy-server)))
        actor2 (spawn
                 (fn []
                   (receive m (println "I'm done here, exit"))))]
    (spawn :trap true
           (fn []
             (link! actor2)
             (loop []
               (let [m (receive)]
                 (match m
                        [:exit _ a reason] (println "Received exit")
                        m (do
                            (println (str "received message" m))
                            (println (str "called result "
                                          (call! srv [:ping actor2]))))))
               (recur))))))

(defn kaboom []
  (let [actor1 (spawn-actor1)]
    (! actor1 [:start])))

circlespainter commented 8 years ago

We'll align Quasar's selective receive (and call, which is a special case of it) with that of Erlang/OTP and Elixir, which, according to my tests, delay lifecycle messages until the request/response exchange is complete. This should also eliminate these problematic behaviors.
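
To make "delay until the exchange is complete" concrete, here is a minimal sketch in plain Clojure. It is a toy model only, not Quasar's SelectiveReceiveHelper: the mailbox is assumed to be a plain vector, `selective-receive` is a hypothetical helper, and the `[:reply id value]` message shape is invented for illustration (the `[:exit _ actor reason]` shape follows the example above). The point it shows is that a message the call is not waiting for is skipped and left in the mailbox rather than dropped or thrown.

```clojure
;; Toy model (hypothetical, not the Quasar implementation): the mailbox is a
;; vector of messages, oldest first.
(defn selective-receive
  "Returns [matched remaining]. `matched` is the first message satisfying
  `match?`, or nil if none does. Every other message, including skipped
  exit messages, stays in `remaining` in its original order."
  [mailbox match?]
  (let [[skipped [hit & more]] (split-with (complement match?) mailbox)]
    (if (some? hit)
      [hit (vec (concat skipped more))]
      [nil (vec mailbox)])))

;; Actor2's exit message arrives before the server's reply to request 42.
(def mailbox [[:exit nil :actor2 nil]
              [:reply 42 "pong"]])

(let [[reply remaining] (selective-receive
                          mailbox
                          (fn [[tag id]] (and (= tag :reply) (= id 42))))]
  (println "call result:" (last reply))
  (println "still queued for the next receive:" remaining))
```

In this model the exit message is still at the head of the mailbox once the call returns, so an ordinary receive afterwards would pick it up.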

DimaSalakhov commented 8 years ago

Sounds reasonable! :+1:

circlespainter commented 8 years ago

A more thorough summary of Erlang/OTP behavior during call w.r.t. monitor/link: https://github.com/circlespainter/gen_server-experiments

circlespainter commented 8 years ago

I tried again with the latest Quasar 0.7.6-SNAPSHOT build, which includes the fix for https://github.com/puniverse/quasar/issues/187 (available on Sonatype). The exit message from the watch now seems to be received correctly when calling receive after call, while in the link case the exception is still thrown, as that's what Erlang/OTP does as well.

Closed by https://github.com/puniverse/quasar/commit/62d83b541bbdaeedf5c8d2ff431a6be7285e1363.

DimaSalakhov commented 8 years ago

Both watch! and link! produce exit messages when I trap the actor. Will that behaviour still hold in the described case, or will link! throw an exception at me instead?

circlespainter commented 8 years ago

Sorry, I had missed the "trap" part in your description. All the cases you list will now produce an exit message. If instead you link! and do not :trap, you will get an exception while in call!, which is consistent with what Erlang/OTP does as well.

DimaSalakhov commented 8 years ago

beauty :fireworks: Thanks Fabio!

puniverse / pulsar

Call! consumes others' exit messages #58