zwave-js / zwave-js-ui

Full featured Z-Wave Control Panel UI and MQTT gateway. Built using Nodejs, and Vue/Vuetify
https://zwave-js.github.io/zwave-js-ui
MIT License
963 stars 205 forks source link

[bug] App restart loop when zwave-js server throws `EADDRNOTAVAIL` error #2703

Closed perhusaas closed 1 year ago

perhusaas commented 2 years ago

I have a fully functioning Z-Way/Z-wave.me network with about 50 devices. Z-Way is running in a docker, so it was easy to stop that docker, and start a ZwaveJS UI docker to try it out.

The docker start up fine, and connects to the (local) UZB stick, the log starts out nicely, taking in configuration details.

This is how it looks in the logs, repeatedly restarting the driver after [SetSerialApiTimeouts] call: ... 2022-10-03T21:23:34.902Z CNTRLR setting serial API timeouts: ack = 1000 ms, byte = 150 ms 2022-10-03T21:23:34.914Z SERIAL » 0x01050006640f97 (7 bytes) 2022-10-03T21:23:34.914Z DRIVER » [REQ] [SetSerialApiTimeouts] payload: 0x640f 2022-10-03T21:23:34.951Z SERIAL « [ACK] (0x06) 2022-10-03T21:23:34.953Z SERIAL « 0x01050106640f96 (7 bytes) 2022-10-03T21:23:34.954Z SERIAL » [ACK] (0x06) 2022-10-03T21:23:39.081Z DRIVER ███████╗ ██╗ ██╗ █████╗ ██╗ ██╗ ███████╗ ██╗ ███████╗ ╚══███╔╝ ██║ ██║ ██╔══██╗ ██║ ██║ ██╔════╝ ██║ ██╔════╝ ███╔╝ ██║ █╗ ██║ ███████║ ██║ ██║ █████╗ █████╗ ██║ ███████╗ ███╔╝ ██║███╗██║ ██╔══██║ ╚██╗ ██╔╝ ██╔══╝ ╚════╝ ██ ██║ ╚════██║ ███████╗ ╚███╔███╔╝ ██║ ██║ ╚████╔╝ ███████╗ ╚█████╔╝ ███████║ ╚══════╝ ╚══╝╚══╝ ╚═╝ ╚═╝ ╚═══╝ ╚══════╝ ╚════╝ ╚══════╝ 2022-10-03T21:23:39.082Z DRIVER version 10.2.0 2022-10-03T21:23:39.082Z DRIVER 2022-10-03T21:23:39.083Z DRIVER starting driver... 2022-10-03T21:23:39.102Z DRIVER opening serial port /dev/zwave 2022-10-03T21:23:39.111Z DRIVER serial port opened 2022-10-03T21:23:39.112Z SERIAL » [NAK] (0x15) 2022-10-03T21:23:39.116Z DRIVER loading configuration... 2022-10-03T21:23:39.129Z CONFIG Using external configuration dir /usr/src/app/store/.config-db 2022-10-03T21:23:39.129Z CONFIG version 10.2.0 2022-10-03T21:23:40.044Z DRIVER beginning interview... 2022-10-03T21:23:40.046Z DRIVER added request handler for AddNodeToNetwork (0x4a)... 1 registered ...

Any Ideas? Everything still works fine when I boot up the z-way docker again..

robertsLando commented 2 years ago

Cannot see anything strange in these logs, could you send full logs please? Both driver and application?

perhusaas commented 2 years ago

Sure, i'll upload later, but it seems to be a repetitive pattern of the same.. I tried to set log-level: Silly, not sure if i made it stick? One result of this is that the driver resetting bogs down the web interface as well, getting "network error" while saving/reloading the settings..

erikced commented 2 years ago

Have you disabled soft reset under Settings->Z-Wave, UZB1 does not work with it enabled (mine did not at least).

perhusaas commented 2 years ago

zwavejs_2022-10-04.log z-ui_2022-10-04.log

Log files attached.

I disabled soft reset during the test, but except for one of the subsequent restarts, it restarts at the same point mentioned above every time?

robertsLando commented 2 years ago

I cannot see anything that closes the application, not an error log or else, even there is no signal intercepted to say the application is closing. Seems there is something else that force closes the app, how did you deploy the application? What command are you using to run it? Are you using docker-compose?

cc @AlCalzone

perhusaas commented 2 years ago

Yes, it's deployed by docker compose, using the official latest image. I'll upload the compose file later.

robertsLando commented 2 years ago

Try also to run docker-compose without -d to see if the logs there show something about the reason it crash

perhusaas commented 2 years ago

Here's my docker-compose: version: "3.9" services: zwave-js-ui: container_name: zwave-js-ui image: zwavejs/zwave-js-ui:latest restart: always tty: true stop_signal: SIGINT networks: HAnett: ipv4_address: 192.168.0.3 volumes: - type: bind source: /etc/localtime target: /etc/localtime - type: volume source: Z2MQTTconfig target: /usr/src/app/store environment: - SESSION_SECRET=mysupersecretkey - ZWAVEJS_EXTERNAL_CONFIG=/usr/src/app/store/.config-db # Uncomment if you want log times and dates to match your timezone instead of UTC # Available at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones #- TZ=Europe/Oslo devices: - "/dev/serial/by-id/usb-0658_0200-if00:/dev/zwave" #your device here ports: - "8891:8091" # port for web interface - "3333:3333" # port for Z-Wave JS websocket server networks: HAnett: name: HAnett driver: bridge ipam: config: - subnet: 192.168.0.0/16 volumes: Z2MQTTconfig: name: Z2MQTTconfig

Haven't really grasped the concept of the supersecretkey, whats it used for?

The last of the docker log:

2022-10-05 10:37:48.757 DEBUG Z-WAVE: Node 81 has been added to nodes array 2022-10-05 10:37:48.758 DEBUG Z-WAVE: Binding to node 82 events 2022-10-05 10:37:48.758 DEBUG Z-WAVE: Node 82 has been added to nodes array 2022-10-05 10:37:48.759 DEBUG Z-WAVE: Binding to node 83 events 2022-10-05 10:37:48.759 DEBUG Z-WAVE: Node 83 has been added to nodes array 2022-10-05 10:37:48.762 INFO Z-WAVE: Scanning network with homeid: 0xc84d9ef4 2022-10-05 10:37:48.766 INFO Z-WAVE: Node 1: interview started Error: listen EADDRNOTAVAIL: address not available <HostIP>:3333 at Server.setupListenHandle [as _listen2] (node:net:1289:21) at listenInCluster (node:net:1354:12) at doListen (node:net:1492:7) at processTicksAndRejections (node:internal/process/task_queues:84:21)

3333 is the port i set up for Home Assistant, haven't tested connecting HA yet, as I wanted to get the network up first.

I just tried to disable the WS server, and now the driver stays up! it seems to work as it should now.

3333 is available on the Host, what did I do wrong?

robertsLando commented 2 years ago

Error: listen EADDRNOTAVAIL: address not available :3333

That's the error so, it's an error in your docker-compose file, try using another address

Haven't really grasped the concept of the supersecretkey, whats it used for?

That is used to create tokens when you have auth enabled in settings

perhusaas commented 2 years ago

Ok, but, 3333 is open and available on the host, mabye its the IP address considering that i also have an internal docker network?

I'd recommend an robustification of the code to prevent this behaviour in this condition, the SW became quite useless when the driver kept resetting.. :) I could hardly update the settings to correct the error.

Other than this issue, the system looks very nice, thanks to all contributors for making good software!

robertsLando commented 2 years ago

I'd recommend an robustification of the code to prevent this behaviour in this condition, the SW became quite useless when the driver kept resetting.. :) I could hardly update the settings to correct the error.

EADDRNOTAVAIL mostly means the host used in bind for @zwave-js/server is not valid. You should try to fix it by changing the related configuration in settings manually: https://github.com/zwave-js/zwave-js-ui/blob/master/lib/ZwaveClient.ts#L1361

You can edit it in settings.json file you find in store folder. It's under zwave settings serverHost property, by default it is 0.0.0.0 see https://github.com/zwave-js/zwave-js-ui/blob/977ea07169cb64d11c060aacc8480911df3f4dc0/src/components/Settings.vue#L1038

This is not an issue with the application itself but a configuration error between your docker setup and settings

perhusaas commented 2 years ago

I agree, and have found the json file. My comment was more to the behaviour in an error condition, you would perhaps get less issues posted here if the error didn't make the UI so erratic? ;) I would suggest i.e. a maximum number of driver restart attempts, as that continous behaviour prevented me from altering settings by UI. This will perhaps also lower the bar for new users like me who are not that fluent in programming and linux language.

Again, not a big deal, just a suggestion.

I'll close this thread when i've fixed the address error and post the conclusion.

robertsLando commented 2 years ago

I would suggest i.e. a maximum number of driver restart attempts, as that continous behaviour prevented me from altering settings by UI. This will perhaps also lower the bar for new users like me who are not that fluent in programming and linux language.

This is a specific case that I never faced, basically the app should never close in that way, I could catch the error and prevent the app to exit

AlCalzone commented 2 years ago

Maximum number of restart attempts or an exponential backoff strategy where each restart waits a bit longer could make sense.

Although stopping the restarts means after a longer unavailability of the stick, user intervention is needed, which isn't good. I guess the most important issue is being able to change settings even if something throws on startup.

Does that error need to be caught/handled in the zwave-js-server repo though, or here?

robertsLando commented 2 years ago

Does that error need to be caught/handled in the zwave-js-server repo though, or here?

It makes more sense if I handle it on my side

I guess the most important issue is being able to change settings even if something throws on startup.

This in fact is what happens, this is just an error that never happened to any user so fixing it with a try catch would be enough

perhusaas commented 2 years ago

..Why is the z-stick driver connected to the websocket server settings at all?

In any case, I realised that i don't need to expose 3333 at all considering that HA and Z2MQTT has a common internal docker network. Everything is fine now, and HA connected with no problem. :)

robertsLando commented 2 years ago

Why is the z-stick driver connected to the websocket server settings at all?

If you mean why it is under zwave prop in settings json it's because It is something I use in ZwaveClient class and all ZwaveClient related settings are there, BTW it's not something a user should ever worry about, that file it's just a place to store settings and should never be edited manually, expect some cases like this

robertsLando commented 2 years ago

By double checking code seems I already catch start errors and by looking at zwavejs server implementation the listen is done there.

@raman325 Any clue what could be the root cause of this? It's an error thrown by zwavejs server listen

robertsLando commented 1 year ago

Follow updates of https://github.com/zwave-js/zwave-js-server/issues/782