FreeOpcUa / python-opcua

LGPL Pure Python OPC-UA Client and Server
http://freeopcua.github.io/
GNU Lesser General Public License v3.0

How to manage BadServerHalted #1083

Open dave-cz opened 4 years ago

dave-cz commented 4 years ago

I have a program that uses SubHandler.datachange_notification() to save data from OPC to InfluxDB. Code here: https://github.com/FreeOpcUa/python-opcua/issues/845#issuecomment-638213121

The program runs as a service on an Ubuntu server. Sometimes the admin restarts the OPC server and I get this error:

jun 25 13:56:36 ubuntu systemd[1]: Started opc_to_influx service.
jun 25 13:57:29 ubuntu opc_to_influx.sh[10832]: 2020-06-25 13:57:29,149+0200   INFO       opc_to_influx             create_subscription done
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: ServiceFault from server received while waiting for publish response
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: exception calling callback for <Future at 0x7fa7c124aad0 state=finished returned Buffer>
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: Traceback (most recent call last):
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 324, in _invoke_callbacks
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     callback(self)
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/home/user/.local/lib/python3.7/site-packages/opcua/client/ua_client.py", line 493, in _call_publish_callback
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     self._uasocket.check_answer(data, "while waiting for publish response")
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/home/user/.local/lib/python3.7/site-packages/opcua/client/ua_client.py", line 93, in check_answer
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     hdr.ServiceResult.check()
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/home/user/.local/lib/python3.7/site-packages/opcua/ua/uatypes.py", line 218, in check
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     raise UaStatusCodeError(self.value)
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: opcua.ua.uaerrors._auto.BadServerHalted: "The server has stopped and cannot process any requests."(BadServerHalted)
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: ServiceFault from server received while waiting for publish response
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: exception calling callback for <Future at 0x7fa7daacc650 state=finished returned Buffer>
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: Traceback (most recent call last):
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 324, in _invoke_callbacks
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     callback(self)
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/home/user/.local/lib/python3.7/site-packages/opcua/client/ua_client.py", line 493, in _call_publish_callback
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     self._uasocket.check_answer(data, "while waiting for publish response")
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/home/user/.local/lib/python3.7/site-packages/opcua/client/ua_client.py", line 93, in check_answer
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     hdr.ServiceResult.check()
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:   File "/home/user/.local/lib/python3.7/site-packages/opcua/ua/uatypes.py", line 218, in check
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]:     raise UaStatusCodeError(self.value)
jul 01 14:16:47 ubuntu opc_to_influx.sh[10832]: opcua.ua.uaerrors._auto.BadServerHalted: "The server has stopped and cannot process any requests."(BadServerHalted)

The program stops working, but systemctl still reports the service as "active (running)", so the process never dies. This is a huge problem: I want the program to die so that systemd restarts it. It seems that some thread in the opcua client is still alive.
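To illustrate the symptom described above, here is a minimal sketch of how a leftover worker thread can keep a process alive, and how os._exit() ends it anyway. It assumes that a non-daemon thread is what keeps the service running, which this thread does not confirm. The child script and the run_child() helper are hypothetical and do not use python-opcua at all.

```python
import subprocess
import sys
import textwrap

# Hypothetical child script: a worker thread that never finishes on its own.
CHILD = textwrap.dedent("""
    import os, threading, time

    # A non-daemon thread standing in for a library worker that keeps running.
    threading.Thread(target=time.sleep, args=(3600,)).start()

    # sys.exit() here would only end the main thread; the interpreter would
    # then wait for the worker. os._exit() ends the process immediately and
    # skips cleanup handlers, so the exit code reaches the service manager.
    os._exit(1)
""")


def run_child(timeout=10):
    """Run the child script in a fresh interpreter and return its exit code."""
    return subprocess.run([sys.executable, "-c", CHILD], timeout=timeout).returncode
```

With a non-zero exit code, systemd can restart the service, provided the unit sets a restart policy such as Restart=on-failure. Since os._exit() skips buffered-output flushing and atexit handlers, it is probably best kept as a last resort after a normal shutdown has been attempted.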

I tried signal.pause() (in the code linked above), the following loop,

while True:
    sleep(60)
    try:
        client.get_root_node()
    except Exception as ex:
        logger.exception(ex)
        break

and running my code via multiprocessing.Process(), all without a satisfactory result. I see no place to handle server errors.

oroulet commented 4 years ago

You need a thread (probably the main thread) in your application that reads the server state every second. The read will fail on any error, and then you can disconnect/reconnect. There is no way around this with this kind of network application; we cannot rely on the server warning us.
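The polling approach described above could be sketched as follows. This is only an illustration under assumptions, not the maintainers' recommended implementation: watch_server(), run(), and the max_checks parameter are hypothetical names, and the example assumes that reading the standard Server_ServerStatus_State node raises an exception once the server is gone.

```python
import time


def watch_server(read_state, interval=1.0, max_checks=None, sleep=time.sleep):
    """Call read_state() every `interval` seconds until it raises.

    Returns the exception that ended the loop, or None if max_checks
    (a hypothetical knob to bound the loop, e.g. in tests) was reached.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        try:
            read_state()  # must be a real round trip to the server
        except Exception as exc:
            return exc
        checks += 1
        sleep(interval)
    return None


def run(url):
    """Connect, poll the server state, and return a process exit code."""
    from opcua import Client, ua  # imported lazily; only needed for a real server

    client = Client(url)
    client.connect()
    try:
        state = client.get_node(ua.NodeId(ua.ObjectIds.Server_ServerStatus_State))
        # Subscriptions would be created here.
        error = watch_server(state.get_value)
    finally:
        try:
            client.disconnect()
        except Exception:
            pass  # the server may already be gone
    return 0 if error is None else 1
```

A caller would pass the result of run() with its own endpoint URL to sys.exit(), or reconnect in a loop instead of exiting. Keeping the loop separate from the client makes it possible to exercise it with a fake read_state callable.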

dave-cz commented 4 years ago

@oroulet I understand that the server has no way to warn me. I'm looking for a solution along the lines of "if I lose the connection, my program terminates properly".

I tried the while True loop under if __name__ == '__main__': in the executed file, but it didn't work. In the program with SubHandler, the traceback was written to stderr and not via logger.exception() in my code as I wrote above. In another program ("if a file was changed, load it and set_value() to OPC"), the program just got stuck with no error or message.
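One likely reason the traceback bypasses a try/except in the main loop: the log above shows the error raised inside a Future done-callback (_invoke_callbacks in concurrent/futures/_base.py). The standard library catches exceptions from such callbacks and logs them on the "concurrent.futures" logger instead of re-raising them. The sketch below reproduces just that behavior with a stand-in callback; ListHandler, failing_callback(), and demo() are hypothetical names, and no OPC UA code is involved.

```python
import logging
from concurrent.futures import Future


class ListHandler(logging.Handler):
    """Collect log records in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def failing_callback(future):
    # Stand-in for the library callback that raised BadServerHalted.
    raise RuntimeError("server halted")


def demo():
    """Return (whether the except block ran, messages logged by the stdlib)."""
    handler = ListHandler()
    logger = logging.getLogger("concurrent.futures")
    logger.addHandler(handler)
    reached_except = False
    try:
        future = Future()
        future.add_done_callback(failing_callback)
        future.set_result(b"data")  # runs the callback, which raises
    except RuntimeError:
        reached_except = True  # not reached: the exception is swallowed
    finally:
        logger.removeHandler(handler)
    return reached_except, [record.getMessage() for record in handler.records]
```

Here the except block never runs, and the only trace of the failure is a log record starting with "exception calling callback for", matching the line in the journal output. The thread running the callback is not terminated by this, which is why an independent check of the connection is still needed.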

SHAFIT commented 1 year ago

> you need to have a thread (probably main thread) in your application that read the server state every second. it will fail at any error and then you can disconnect/reconnect.

Did you mean that this error (opcua.ua.uaerrors._auto.BadServerHalted: "The server has stopped and cannot process any requests."(BadServerHalted)) will automatically kill its thread?