voryx / Thruway

PHP Client and Router Library for Autobahn and WAMP (Web Application Messaging Protocol) for Real-Time Application Messaging
MIT License

PHP Fatal Error #199

Status: Closed (inovamatic closed this issue 8 years ago)

inovamatic commented 8 years ago

Hi,

If a PHP fatal error occurs, is there any way to force the crossbar service to restart instead of having only the worker exit? Alternatively, is there a way to detect the worker exit and force a service restart?

Thanks

RafaelKa commented 8 years ago

See supervisord or https://github.com/jamset/process-load-manager

inovamatic commented 8 years ago

Thanks RafaelKa,

I solved it with monit by adding the following line:

if children < 3 then restart

3 is the number of child processes that crossbar-controller has in my environment. If any of them exits, crossbar is restarted and I am notified. You have to check the number of child processes you have running using:

sudo service crossbar status

Thanks
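For readers without monit, the same check can be sketched in Python. This is only an illustration of the rule above, not part of monit or Crossbar.io: `count_children` and `needs_restart` are hypothetical helper names, `pgrep -P` is assumed to be available (procps), and the threshold of 3 is the value from this environment.

```python
import subprocess

# Environment-specific: the number of child processes the controller
# normally has (3 in the setup described above).
EXPECTED_CHILDREN = 3


def count_children(parent_pid: int) -> int:
    """Count direct children of parent_pid using `pgrep -P` (assumed installed)."""
    result = subprocess.run(
        ["pgrep", "-P", str(parent_pid)],
        capture_output=True,
        text=True,
        check=False,  # pgrep exits non-zero when nothing matches
    )
    # pgrep prints one PID per line; no output means no children.
    return len(result.stdout.split())


def needs_restart(child_count: int, expected: int = EXPECTED_CHILDREN) -> bool:
    """Mirror the monit rule `if children < 3 then restart`."""
    return child_count < expected
```

A periodic job could call `needs_restart(count_children(controller_pid))` and restart the service when it returns `True`; how the controller PID is obtained and how the restart is issued depend on the installation.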

oberstet commented 8 years ago

None of the above is necessary or recommended.

In standalone mode, Crossbar.io will shut down the whole node when one of the originally started workers exits. For example, here is what happens if you hard-kill one of the workers:

(cpy2711_1) oberstet@office-corei7:~$ crossbar version
     __  __  __  __  __  __      __     __
    /  `|__)/  \/__`/__`|__) /\ |__)  |/  \
    \__,|  \\__/.__/.__/|__)/~~\|  \. |\__/

 Crossbar.io        : 0.14.0
   Autobahn         : 0.14.1 (with JSON, MessagePack, CBOR, UBJSON)
   Twisted          : 16.2.0-EPollReactor
   LMDB             : 0.89/lmdb-0.9.18
   Python           : 2.7.11/CPython
 OS                 : Linux-4.4.0-22-generic-x86_64-with-debian-stretch-sid
 Machine            : x86_64

(cpy2711_1) oberstet@office-corei7:~$ mkdir node1
(cpy2711_1) oberstet@office-corei7:~$ cd node1
(cpy2711_1) oberstet@office-corei7:~/node1$ crossbar init --template hello:python
Initializing application template 'hello:python' in directory '/home/oberstet/node1'
Using template from '/home/oberstet/cpy2711_1/lib/python2.7/site-packages/crossbar/templates/hello/python'
Creating directory /home/oberstet/node1/web
Creating directory /home/oberstet/node1/.crossbar
Creating file /home/oberstet/node1/README.md
Creating file /home/oberstet/node1/hello.py
Creating file /home/oberstet/node1/web/index.html
Creating file /home/oberstet/node1/.crossbar/config.json
Application template initialized

To start your node, run 'crossbar start --cbdir /home/oberstet/node1/.crossbar'

(cpy2711_1) oberstet@office-corei7:~/node1$ crossbar start
2016-06-04T12:28:04+0200 [Controller   3042] New node key generated!
2016-06-04T12:28:04+0200 [Controller   3042]      __  __  __  __  __  __      __     __
2016-06-04T12:28:04+0200 [Controller   3042]     /  `|__)/  \/__`/__`|__) /\ |__)  |/  \
2016-06-04T12:28:04+0200 [Controller   3042]     \__,|  \\__/.__/.__/|__)/~~\|  \. |\__/
2016-06-04T12:28:04+0200 [Controller   3042]                                         
2016-06-04T12:28:04+0200 [Controller   3042]     Crossbar.io Version: 0.14.0
2016-06-04T12:28:04+0200 [Controller   3042]     Node Public Key: de0a91c307327ce307b0efacad583c895c732ffbf82e7a94e54f4b64b65a15d1
2016-06-04T12:28:04+0200 [Controller   3042] 
2016-06-04T12:28:04+0200 [Controller   3042] Running from node directory '/home/oberstet/node1/.crossbar'
2016-06-04T12:28:04+0200 [Controller   3042] Controller process starting (CPython-EPollReactor) ..
2016-06-04T12:28:04+0200 [Controller   3042] Node configuration loaded from 'config.json'
2016-06-04T12:28:04+0200 [Controller   3042] Node ID 'office-corei7' set from hostname
2016-06-04T12:28:04+0200 [Controller   3042] Using default node shutdown triggers [u'shutdown_on_worker_exit']
2016-06-04T12:28:04+0200 [Controller   3042] Joined realm 'crossbar' on node management router
2016-06-04T12:28:04+0200 [Controller   3042] Starting Router with ID 'worker-001'...
2016-06-04T12:28:04+0200 [Router       3049] Worker process starting (CPython-EPollReactor) ..
2016-06-04T12:28:05+0200 [Controller   3042] Router with ID 'worker-001' and PID 3049 started
2016-06-04T12:28:05+0200 [Router       3049] Realm 'realm1' started
2016-06-04T12:28:05+0200 [Controller   3042] Router 'worker-001': realm 'realm-001' (named 'realm1') started
2016-06-04T12:28:05+0200 [Controller   3042] Router 'worker-001': role 'role-001' (named 'anonymous') started on realm 'realm-001'
2016-06-04T12:28:05+0200 [Router       3049] Site starting on 8080
2016-06-04T12:28:05+0200 [Controller   3042] Router 'worker-001': transport 'transport-001' started
2016-06-04T12:28:05+0200 [Controller   3042] Starting Container with ID 'worker-002'...
2016-06-04T12:28:05+0200 [Container    3055] Worker process starting (CPython-EPollReactor) ..
2016-06-04T12:28:05+0200 [Controller   3042] Container with ID 'worker-002' and PID 3055 started
2016-06-04T12:28:05+0200 [Controller   3042] Container 'worker-002': component 'component-001' started
2016-06-04T12:28:05+0200 [Container    3055] subscribed to topic 'onhello'
2016-06-04T12:28:05+0200 [Container    3055] procedure add2() registered
2016-06-04T12:28:05+0200 [Container    3055] published to 'oncounter' with counter 0
2016-06-04T12:28:06+0200 [Container    3055] published to 'oncounter' with counter 1
2016-06-04T12:28:07+0200 [Container    3055] published to 'oncounter' with counter 2
2016-06-04T12:28:08+0200 [Container    3055] published to 'oncounter' with counter 3
2016-06-04T12:28:09+0200 [Container    3055] published to 'oncounter' with counter 4
2016-06-04T12:28:10+0200 [Container    3055] published to 'oncounter' with counter 5
2016-06-04T12:28:11+0200 [Container    3055] published to 'oncounter' with counter 6
2016-06-04T12:28:12+0200 [Container    3055] published to 'oncounter' with counter 7
2016-06-04T12:28:13+0200 [Container    3055] published to 'oncounter' with counter 8
2016-06-04T12:28:14+0200 [Container    3055] published to 'oncounter' with counter 9
2016-06-04T12:28:15+0200 [Container    3055] published to 'oncounter' with counter 10
2016-06-04T12:28:16+0200 [Container    3055] published to 'oncounter' with counter 11
2016-06-04T12:28:17+0200 [Container    3055] published to 'oncounter' with counter 12
2016-06-04T12:28:18+0200 [Container    3055] published to 'oncounter' with counter 13
2016-06-04T12:28:19+0200 [Container    3055] published to 'oncounter' with counter 14
2016-06-04T12:28:20+0200 [Controller   3042] Native worker connection closed uncleanly: A process has ended with a probable error condition: process ended by signal 9.
2016-06-04T12:28:20+0200 [Controller   3042] Node worker worker-002 ended with error ([Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ProcessTerminated'>: A process has ended with a probable error condition: process ended by signal 9.
])
2016-06-04T12:28:20+0200 [Controller   3042] Node worker ended, and trigger 'shutdown_on_worker_exit' active
2016-06-04T12:28:20+0200 [Controller   3042] Node shutting down ..
2016-06-04T12:28:20+0200 [Controller   3042] Shutting down node...
2016-06-04T12:28:20+0200 [Controller   3042] sending TERM to subprocess 3049
2016-06-04T12:28:20+0200 [Controller   3042] waiting for 3049 to exit...
2016-06-04T12:28:20+0200 [Controller   3042] sending TERM to subprocess 3055
2016-06-04T12:28:20+0200 [Router       3049] Native worker received SIGTERM - shutting down ..
2016-06-04T12:28:20+0200 [Router       3049] Shutdown of worker requested!
2016-06-04T12:28:20+0200 [Router       3049] Connection to node controller closed cleanly
2016-06-04T12:28:20+0200 [Router       3049] (TCP Port 8080 Closed)
2016-06-04T12:28:20+0200 [Controller   3042] Native worker connection closed cleanly.
2016-06-04T12:28:20+0200 [Controller   3042] Node worker worker-001 ended successfully
2016-06-04T12:28:20+0200 [Controller   3042] Node worker ended, and trigger 'shutdown_on_worker_exit' active
2016-06-04T12:28:20+0200 [Controller   3042] Node is already shutting down.
(cpy2711_1) oberstet@office-corei7:~/node1$ 
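
The pattern visible in the log (one worker ends, the controller sends TERM to the remaining workers, the node exits) can be sketched as a toy supervisor. This is a simplified illustration under stated assumptions, not Crossbar.io's actual implementation: `run_node` is a hypothetical name, the workers are arbitrary commands, and polling stands in for the controller's event-driven process monitoring.

```python
import subprocess
import time


def run_node(worker_cmds, poll_interval=0.05):
    """Start every worker; as soon as any one exits, terminate the rest.

    Returns the index of the worker whose exit triggered the shutdown.
    Expects at least one worker command.
    """
    workers = [subprocess.Popen(cmd) for cmd in worker_cmds]
    try:
        while True:
            for index, proc in enumerate(workers):
                if proc.poll() is not None:  # this worker has exited
                    return index
            time.sleep(poll_interval)
    finally:
        # Roughly the "sending TERM to subprocess ..." step in the log.
        for proc in workers:
            if proc.poll() is None:
                proc.terminate()
        # Reap all children so none are left behind as zombies.
        for proc in workers:
            proc.wait()
```

Calling `run_node` with one long-running command and one that exits immediately returns the index of the short-lived one and stops the other, which is the same "any worker exit ends the node" behavior as the 'shutdown_on_worker_exit' trigger shown above.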

inovamatic commented 8 years ago

Oberstet,

I also thought that a node exit after the PHP fatal error was the expected behavior, but what do I need to configure so that the node exits after a worker exits? Can you please describe your configuration? Why does a PHP fatal error in my setup cause only the worker to exit?

Thanks

inovamatic commented 8 years ago

In your example I see that the worker that exited is a node worker, while mine is a guest worker. Can I force the same behavior for a guest worker?

inovamatic commented 8 years ago

Hi Oberstet,

Can you please advise based on my last 2 comments? I would really like to do this the right way.

Thanks