kquick / Thespian

Python Actor concurrency library
MIT License

Actors won't exit #58

Closed rbotafogo closed 3 years ago

rbotafogo commented 4 years ago

Hello....

I'm working on Windows 10 with OpenCV and Thespian to read multiple real-time video feeds. Since video arrives at 30 FPS, I have one Actor per camera capturing frames. Each captured frame is stored in a memory-mapped file for processing by other actors. The Actor capturing frames (VideoDecoder) starts a helper Actor (DrumBeat) that beats every 30ms, notifying VideoDecoder that it should capture a new frame. DrumBeat uses wakeupAfter to beat at the 30ms rhythm. So far, so good...

When I shut down the system, I see the exitActor message being propagated from the system to all the created Actors. However, VideoDecoder never gets the exitActor message. It seems to me that the wakeup messages from DrumBeat don't let VideoDecoder receive any other messages. So VideoDecoder never exits, and neither does DrumBeat.

Any ideas on how to exit those processes? Am I doing something wrong?

Thanks

kquick commented 4 years ago

I don't think you are necessarily doing anything wrong at this point.

How long does it take the VideoDecoder to do its work each time it is awoken by the DrumBeat? If it takes the VideoDecoder close to or more than 30ms, then messages from DrumBeat may be queueing up faster than they are handled, so it takes a long time for the VideoDecoder to work through the backlog and reach the exit request. One way to test this would be to change the DrumBeat wakeup time from 30ms to something large like 5s; this would obviously have bad effects on your image capture, but if the actors exit cleanly in this mode, that would support queueing as the root cause.
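
As a rough illustration of the queueing hypothesis, here is a toy model. It is not Thespian's scheduler: it simply assumes beats arrive at a fixed period, messages are handled strictly in arrival order, and each beat costs a fixed amount of work. The function name `exit_delay_ms` and the 35ms and 10ms work figures are made up for the example.

```python
def exit_delay_ms(period_ms, work_ms, exit_sent_ms):
    """Return how long an exit request waits behind already-queued beats.

    Toy model (an assumption, not Thespian internals): one beat arrives
    every period_ms starting at t=0, messages are handled in arrival
    order, and handling one beat takes work_ms.
    """
    free_at = 0  # time at which the handler has finished everything so far
    t = 0        # arrival time of the next beat
    while t <= exit_sent_ms:
        start = max(free_at, t)    # a beat cannot be handled before it arrives
        free_at = start + work_ms  # handler is busy until this beat is done
        t += period_ms
    # The exit request is handled once everything queued before it is done.
    return max(free_at, exit_sent_ms) - exit_sent_ms


# Hypothetical figures: exit requested 10 s after the first beat.
print(exit_delay_ms(30, 10, 10_000))    # handler keeps up with a 30ms beat
print(exit_delay_ms(30, 35, 10_000))    # handler is 5ms too slow per beat
print(exit_delay_ms(5000, 35, 10_000))  # same slow handler, 5s beat
```

In this model a handler that is only 5ms too slow per beat delays the exit request by well over a second after 10 seconds of running, and the delay keeps growing the longer the system runs. With a 5s beat the backlog never forms, which is why the slower beat is a useful test.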

Also, you referred to an "exitActor message", but the Thespian shutdown sends an ActorExitRequest. There is special handling for ActorExitRequest to ensure that it causes a shutdown even if it is ignored, but if you have a different message for requesting shutdown and the VideoDecoder is ignoring that message (e.g. you are using ActorTypeDispatcher and haven't added a receiveMsg_exitMessage method), then Thespian won't know it's an actor shutdown request, so it won't enforce a shutdown.

An Actor generally doesn't shut down until its children shut down, so it could also be that VideoDecoder is seeing the message but DrumBeat is dropping it (e.g. possibly as described above) and therefore not exiting. Do you use self.wakeupAfter() in DrumBeat, or is it doing something blocking like time.sleep()? If it's blocking, then it will never process incoming messages (even an ActorExitRequest) and therefore never shut down, blocking VideoDecoder from shutting down as well.
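
To show why the blocking case matters, here is a small single-threaded mailbox with a simulated clock. It is a toy, not Thespian code: `ToyLoop`, `beat_with_wakeup`, `beat_with_sleep`, and `exit_time` are hypothetical names. The first handler stands in for re-arming self.wakeupAfter() on every wakeup; the second stands in for a time.sleep() loop inside the message handler.

```python
import heapq
import itertools


class ToyLoop:
    """Single-threaded mailbox with a simulated clock in milliseconds."""

    def __init__(self):
        self.now = 0
        self._queue = []
        self._seq = itertools.count()  # tie-breaker, keeps arrival order

    def send(self, msg, delay_ms=0):
        heapq.heappush(self._queue, (self.now + delay_ms, next(self._seq), msg))

    def run(self, handler, until_ms):
        """Deliver messages in due-time order; return when 'exit' is handled."""
        while self._queue and self._queue[0][0] <= until_ms:
            due, _, msg = heapq.heappop(self._queue)
            self.now = max(self.now, due)  # the clock never runs backwards
            if msg == "exit":
                return self.now  # time at which the exit was honoured
            handler(self, msg)
        return None  # exit was never processed within the window


def beat_with_wakeup(loop, msg):
    # Non-blocking: do one beat, ask to be woken in 30ms, then return
    # so the loop can deliver whatever else is waiting.
    loop.send("wakeup", delay_ms=30)


def beat_with_sleep(loop, msg):
    # Blocking: stay inside the handler and "sleep" between 1000 beats.
    # Advancing the clock here models time passing while nothing else runs.
    for _ in range(1000):
        loop.now += 30


def exit_time(handler):
    loop = ToyLoop()
    loop.send("start")
    loop.send("exit", delay_ms=100)  # exit requested 100ms after start
    return loop.run(handler, until_ms=60_000)


print(exit_time(beat_with_wakeup))  # exit seen when it is due
print(exit_time(beat_with_sleep))   # exit seen only after the handler returns
```

The wakeup-style handler lets the exit through at the 100ms mark. The sleep-style handler delays it until all 1000 simulated beats are done, and a real `while True` sleep loop would never let it through at all.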

kquick commented 4 years ago

Any updates on this, @rbotafogo ?

rbotafogo commented 4 years ago

Hi... sorry for not getting back to you. I didn't have much time to look at the issue, but I had a bug that put my code in a busy loop. Fixing that improved the actors' shutdown, but sometimes they still persist. The problem might actually be in my code and not in Thespian. Speed is critical in this application, as each frame needs to be processed within 30ms. So Actors communicate through message passing but also through a memory-mapped file. I'm considering the possibility that processes deadlock once some of them terminate without properly updating the memory-mapped file. If you like, I could close the issue while I investigate further, or I could leave it open until I find the problem.

Other than that, the code is working fine and did speed up development quite a bit.

Thanks

kquick commented 4 years ago

Thanks for the update, @rbotafogo. You can leave the issue open if you'd like, it's up to you. If it's open, I'll probably ping you periodically.

kquick commented 3 years ago

Hi @rbotafogo , did you resolve everything to your satisfaction, or are there still concerns that should be addressed here?

rbotafogo commented 3 years ago

Hello Kevin,

Yes. All resolved. Sorry that I forgot to close this issue.

Thanks

kquick commented 3 years ago

OK, sounds good.