asynkron / protoactor-go

Proto Actor - Ultra fast distributed actors for Go, C# and Java/Kotlin
http://proto.actor
Apache License 2.0
5.05k stars 523 forks source link

It might be better to remove `actor.stopping` from the application layer #214

Open alex023 opened 6 years ago

alex023 commented 6 years ago

Description:

If Actor appears panic when it responds to the actor.Stopping message, it goes into restarting and continues to survive. There are data anomalies, memory spillovers or other hidden dangers. Maybe we should only keep the *actor.Stopped,and remove *actor.Stopping from application layer.

TestCode:

package main

import (
    "fmt"
    "time"
    "github.com/AsynkronIT/protoactor-go/actor"
    "reflect"
    "github.com/AsynkronIT/goconsole"
    "context"
)

//message for actor
type MessageHello struct{}
type MessageCaculate struct {
    x int
    y int
}

//actor definition
type TestActor struct {
    count      int //if panic it will not reset
    cancelFunc context.CancelFunc
    cancelCtx  context.Context
}

func (pa *TestActor) Receive(ctx actor.Context) {
    pa.count++
    msgType := reflect.TypeOf(ctx.Message())
    fmt.Printf("%v: receive message[%v] \n", pa.count, msgType)

    switch msg := ctx.Message().(type) {
    case *actor.Started:
        fmt.Println("   Started, initialize actor here")
        pa.cancelCtx, pa.cancelFunc = context.WithCancel(context.Background())
        go func() {
            t := time.NewTicker(time.Second * 5)
            for {
                select {
                case <-t.C:
                    ctx.Self().Tell(MessageHello{})
                case <-pa.cancelCtx.Done():
                    goto Loop

                }
            }
        Loop:
        }()
    case *actor.Stopping:
        fmt.Println("   Stopping, actor is about shut down")
        time.Sleep(time.Second)
        pa.cancelFunc()
        panic("")
    case *actor.Stopped:
        fmt.Println("   Stopped, actor and it's children are stopped")
    case *actor.Restarting:
        fmt.Println("   Restarting, actor is about restart")
    case MessageCaculate:
        fmt.Printf("   caculate,  %v multi %v= %v \n",msg.x,msg.y,msg.x*msg.y)
    case MessageHello:
        println("I am Survive!")
    }
}

func main() {
    props := actor.FromProducer(func() actor.Actor {
        return &TestActor{count:0}
    })
    actor := actor.Spawn(props)

    actor.Tell(MessageCaculate{2, 1})
    time.Sleep(1 * time.Second)

    fmt.Println(">  ready send stop message, please watching the actor still survive! ")
    time.Sleep(1 * time.Second)
    actor.Stop()

    //now we can watching actor report
    console.ReadLine()
}
rogeralsing commented 6 years ago

Nice catch. very tricky edge case. Maybe stopping should even have it's own recovery handling to prevent this.

Let's brainstorm around it

bryanpkc commented 6 years ago

Both *Stopping and *Stopped have this problem. These life cycle messages are essential to the design of Proto.Actor so I would try to avoid deleting them. It seems clear that a stopping actor that fails should not be restarted and potentially prevent an entire tree of actors from stopping. I can think of two reasonable approaches:

  1. The stopping actor should simply not escalate the failure; EscalateFailure can return early if ctx.state >= stateStopping. The defer call in mailbox.go will still trigger an error message to be logged, so the failure is not completely invisible to the user.

  2. The supervisor should not try to resume or restart an actor that has failed while stopping. This approach puts the responsibility to do the right thing in the SupervisionStrategy, which is where it belongs. To allow the SupervisionStrategy to make the right decision, EscalateFailure could pass along the child's state in the *Failure message.

rogeralsing commented 6 years ago

There are some major changes in the dev branchin for the C# API right now. We have even talked about redesigning how supervision works, but nothing final there yet.

So if we want to do some change to the Go API, we should try to align this right now.