hassio-addons / addon-emqx

EMQX - Home Assistant Community Add-ons
MIT License

After upgrade to 0.5.0 EMQX no longer starts. #93

Closed johnhoogeveen closed 2 months ago

johnhoogeveen commented 3 months ago

My current version is 0.4.1 and it is running well. After upgrading to 0.5.0, EMQX no longer starts. The log I get is below.


s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service base-addon-banner: starting

Add-on: EMQX
The most scalable open-source MQTT broker for IoT. An alternative for the Mosquitto add-on
Add-on version: 0.5.0
You are running the latest version of this add-on.
System: Home Assistant OS 12.1 (amd64 / generic-x86-64)
Home Assistant Core: 2024.3.3
Home Assistant Supervisor: 2024.03.1

s6-rc: info: service base-addon-banner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service base-addon-timezone: starting
s6-rc: info: service base-addon-log-level: starting
s6-rc: info: service fix-attrs successfully started
[16:52:26] INFO: Configuring timezone (Europe/Amsterdam)...
s6-rc: info: service base-addon-log-level successfully started
s6-rc: info: service base-addon-timezone successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service init-emqx: starting
s6-rc: info: service init-emqx successfully started
s6-rc: info: service emqx: starting
s6-rc: info: service emqx successfully started
s6-rc: info: service legacy-services: starting
[16:52:26] INFO: Starting EMQX...
s6-rc: info: service legacy-services successfully started
[16:52:26] INFO: Setting EMQX_NODE__NAME to core-emdx
EMQX_PLUGINS__INSTALL_DIR [plugins.install_dir]: /data/emqx/plugins
EMQX_RPC__PORT_DISCOVERY [rpc.port_discovery]: manual
EMQX_NODE__DATA_DIR [node.data_dir]: /data/emqx/data
EMQX_NODE__COOKIE [node.cookie]: **
EMQX_NODE__NAME [node.name]: core-emdx
2024-04-03T16:52:29.769146+02:00 [error] failed_to_check_schema: emqx_conf_schema
2024-04-03T16:52:29.776520+02:00 [error] #{reason => integrity_validation_crash,stacktrace => [{emqx_conf_schema,validate_cluster_strategy,1,[{file,"emqx_conf_schema.erl"},{line,1494}]},{hocon_tconf,assert_integrity,4,[{file,"hocon_tconf.erl"},{line,182}]},{hocon_tconf,assert_integrity,3,[{file,"hocon_tconf.erl"},{line,176}]},{hocon_tconf,map,4,[{file,"hocon_tconf.erl"},{line,304}]},{hocon_tconf,map_translate,3,[{file,"hocon_tconf.erl"},{line,99}]},{hocon_tconf,generate,3,[{file,"hocon_tconf.erl"},{line,93}]},{hocon_cli,generate,1,[{file,"hocon_cli.erl"},{line,317}]},{escript,run,2,[{file,"escript.erl"},{line,750}]}],exception => {error,{badmatch,["core-emdx"]}},kind => validation_error,validation_name => check_node_name_and_discovery_strategy}
ERROR: call_hocon_failed: -v -t 2024.04.03.16.52.28 -s emqx_conf_schema -c /data/emqx/data/configs/cluster.hocon -c /opt/emqx/etc/emqx.conf -d /data/emqx/data/configs generate
[16:52:31] INFO: Service EMQX exited with code 1 (by signal 0)
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service emqx: stopping
s6-rc: info: service emqx successfully stopped
s6-rc: info: service init-emqx: stopping
s6-rc: info: service init-emqx successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service base-addon-timezone: stopping
s6-rc: info: service base-addon-log-level: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service base-addon-timezone successfully stopped
s6-rc: info: service base-addon-log-level successfully stopped
s6-rc: info: service base-addon-banner: stopping
s6-rc: info: service base-addon-banner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped


When I revert to the backup of version 0.4.1, EMQX works fine again.

ViKozh commented 3 months ago

I also got the same bug, with the same validation error, on version 0.5.0 of the add-on.

johnhoogeveen commented 3 months ago

Just a small update: after upgrading HA to 2024.4.0 the problem remains, so for now I have disabled the auto-update feature, as last night EMQX was upgraded to 0.5.0 and my lights refused to work :-)

StrausFuenf commented 3 months ago

Did you change the env var EMQX_NODE__NAME? I ask because the error message says check_node_name_and_discovery_strategy.

Try appending @a0d7b954-emqx to your node name and then update. After that it should work.

johnhoogeveen commented 3 months ago

The hostname according to HA is already a0d7b954-emqx. When I log in to HA over SSH and ping that name, I do get a response.

But when I look at the vars for EMQX, EMQX_NODE__NAME is set to core-emdx.

When I change that to a0d7b954-emqx and restart EMQX, it uses the new name. However, the known login for the EMQX web interface no longer works and all MQTT traffic is down. Reverting to the value core-emdx makes everything work again. Below is the log that was generated using the a0d7b954-emqx hostname.

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service base-addon-banner: starting

Add-on: EMQX
The most scalable open-source MQTT broker for IoT. An alternative for the Mosquitto add-on
Add-on version: 0.4.1
There is an update available for this add-on!
Latest add-on version: 0.5.0
Please consider upgrading as soon as possible.
System: Home Assistant OS 12.1 (amd64 / generic-x86-64)
Home Assistant Core: 2024.4.1
Home Assistant Supervisor: 2024.03.1
Please, share the above information when looking for help or support in, e.g., GitHub, forums or the Discord chat.

s6-rc: info: service base-addon-banner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service base-addon-timezone: starting
s6-rc: info: service base-addon-log-level: starting
s6-rc: info: service fix-attrs successfully started
[19:17:09] INFO: Configuring timezone (Europe/Amsterdam)...
s6-rc: info: service base-addon-log-level successfully started
s6-rc: info: service base-addon-timezone successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service init-emqx: starting
s6-rc: info: service init-emqx successfully started
s6-rc: info: service emqx: starting
s6-rc: info: service emqx successfully started
s6-rc: info: service legacy-services: starting
[19:17:09] INFO: Starting EMQX...
s6-rc: info: service legacy-services successfully started
[19:17:09] INFO: Setting EMQX_NODE__NAME to a0d7b954-emqx
EMQX_PLUGINS__INSTALL_DIR [plugins.install_dir]: /data/emqx/plugins
EMQX_RPC__PORT_DISCOVERY [rpc.port_discovery]: manual
EMQX_NODE__DATA_DIR [node.data_dir]: /data/emqx/data
EMQX_NODE__COOKIE [node.cookie]: **
EMQX_NODE__NAME [node.name]: a0d7b954-emqx
Listener tcp:default on 0.0.0.0:1883 started.
Listener ssl:default on 0.0.0.0:8883 started.
Listener ws:default on 0.0.0.0:8083 started.
Listener wss:default on 0.0.0.0:8084 started.
Listener http:dashboard on :18083 started.
EMQX 5.5.1 is running now!

StrausFuenf commented 3 months ago

Ah, that's not what I meant. EMQX_NODE__NAME changes the name of your node, not the server/container hostname. (For clustering, this var needs the form nodename@hostname since 5.6; that's why it's crashing.) It is normal that it no longer works, since all your connections connect to core-emdx, which is no longer known to EMQX. Try changing the var to core-emdx@a0d7b954-emqx; after that the traffic should work with 5.5.1, and it should also work with 5.6.
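To make the nodename@hostname rule concrete, here is a minimal Python sketch of the check. It is only an illustration: the function names are hypothetical, and this is not EMQX's actual validator (that lives in emqx_conf_schema.erl). The `badmatch` on `["core-emdx"]` in the log above is at least consistent with the name being split on `@` and producing one part instead of two.

```python
def is_qualified_node_name(node_name: str) -> bool:
    """Return True if node_name looks like 'nodename@hostname'.

    Assumption: exactly one '@' with a non-empty part on each side.
    """
    parts = node_name.split("@")
    return len(parts) == 2 and all(parts)


def qualify_node_name(node_name: str, hostname: str) -> str:
    """Append '@hostname' when the node name has no host part yet."""
    if is_qualified_node_name(node_name):
        return node_name
    if not node_name or not hostname or "@" in node_name:
        raise ValueError(f"cannot qualify node name: {node_name!r}")
    return f"{node_name}@{hostname}"


if __name__ == "__main__":
    # The values below are the ones discussed in this thread.
    for candidate in ("core-emdx", "core-emdx@a0d7b954-emqx"):
        print(candidate, "->", is_qualified_node_name(candidate))
    print(qualify_node_name("core-emdx", "a0d7b954-emqx"))
```

With the values from this thread, the bare name core-emdx fails the check and core-emdx@a0d7b954-emqx passes it.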

Hope that makes it clear.

johnhoogeveen commented 3 months ago

Ah, then I misunderstood. But after making that change and restarting the add-on I still do not get traffic, and it seems that the add-on configuration is different: e.g., the default password is active, and after login my user database is empty. However, when I upgrade 0.4.1 to 0.5.0 it still starts, so that part seems to be fixed. A new challenge: how do I migrate the settings from my running 0.4.1 (with the incorrect EMQX_NODE__NAME) to a new setup? :-)

johnhoogeveen commented 2 months ago

I have accepted the fact that my configuration is gone when I change the node name. So the steps to fix the upgrade issue are:

  1. Change the var to core-emdx@a0d7b954-emqx and restart the add-on.
  2. Log in to EMQX using the default credentials and change the password.
  3. Recreate the MQTT users in Access Control => Authentication.

After performing these steps I verified that all MQTT devices were connecting correctly, and then upgraded the EMQX add-on to 0.5.0.

johnhoogeveen commented 2 months ago

I will close this issue, as it was caused by a misconfiguration on my side when deploying my first EMQX add-on.