Closed adamf663 closed 3 years ago
My issue is solved. I removed the /data/mosquitto.db file, and the broker stopped accumulating messages in memory. I assume a client had once connected with the durable flag and subscribed to a high-traffic topic. I think Mosquitto keeps all messages in case that client reconnects.
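The behaviour described here matches MQTT persistent sessions: when a client connects with the Clean Session flag cleared, the broker keeps its subscriptions and queues QoS 1 and 2 messages for it while it is offline. As a minimal sketch (standard library only, MQTT 3.1.1 framing, no username, password or will; `build_connect` and the client ID are hypothetical names, not taken from this thread), this shows where that flag lives in the CONNECT packet:

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode the MQTT variable-length 'remaining length' field."""
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        # Set the continuation bit while more length bytes follow.
        out.append(digit | (0x80 if n else 0))
        if not n:
            return bytes(out)


def build_connect(client_id: str, clean_session: bool, keepalive: int = 60) -> bytes:
    """Build an MQTT 3.1.1 CONNECT packet without credentials or will."""
    flags = 0x02 if clean_session else 0x00  # bit 1 = Clean Session
    cid = client_id.encode("utf-8")
    variable_header = (
        struct.pack("!H", 4) + b"MQTT"     # protocol name
        + bytes([0x04, flags])             # protocol level 4 (3.1.1), connect flags
        + struct.pack("!H", keepalive)     # keepalive in seconds
    )
    payload = struct.pack("!H", len(cid)) + cid
    body = variable_header + payload
    return b"\x10" + encode_remaining_length(len(body)) + body
```

Per the MQTT 3.1.1 specification, a broker must discard any stored session when the same client ID connects with Clean Session set, so reconnecting once as the stale client (if its ID is known) may be a gentler alternative to deleting the file.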
Glad you got it working, but unfortunately this can't be a solution for those of us on Home Assistant OS. Hope the update gets fixed soon; did we see some response lately from the dev team? Just so we know it is in progress?
did we see some response lately from the dev team?
The thread isn't that long!
How did removing mosquitto.db work for you?
That's just the thing: Home Assistant OS users can't delete a mosquitto.db, because it isn't even there. That is to say, not in the Samba-reachable folders.
I haven't checked via samba and probably can't until I am on my LAN at home.
Cool if you would. It is a most unpleasant state of affairs we're in right now: apparently needing the update (why else would it be available?), but not being able to...
Guys, it's been 2 weeks since the issue was identified and v5.1 was confirmed to work properly. Why hasn't v5.1.1 been rolled back yet? I assume more and more people are being affected by this issue.
I assume more and more people are being affected by this issue.
Probably still not enough people. I suspect this error occurs only under some conditions, not in every setup; otherwise it would have been investigated by now.
I think I have a path to overcome the problem. The comment from realthk got me thinking and I used the following procedure to get 5.1.1 working for me.
This solved the problem in my installation
That could well be, but before even trying this, why not wait for the devs to solve the issue? Or, put another way, why update in the first place? What do we gain by doing so? Does anyone have an idea of the new features/functionality/advantages of 5.1.1 over 5.1.0?
I mean, missing out on this seems hardly a reason for immediate panic:
of course ymmv, but I wouldn't even know exactly what this does... or if I need it, guess not ;-)
Agreed. But I suspect that subsequent releases will carry the problem forward unless you clean up your installation.
For me, reinstalling the add-on did not help. Initially, yes, there are no warnings, but after restarting the zigbee2mqtt and zwave add-ons so they both re-add their autodiscovery topics, you are back in the same situation as before.
That's interesting. I had the exact same situation with zwaveJS2mqtt, and it has now been running for over 12h, as with 5.1.
I must add that I still use the old OpenZWave add-on, as I have had no time to migrate yet. So maybe that's the problem? Did you use OpenZWave in the past? Maybe it helped for you because the reinstall removed the problematic OpenZWave topics, and they did not return because you now use the newer zwaveJS2mqtt, whereas for me the problem is back because OpenZWave is still there...
Just assuming...
I used zwave2mqtt before and did the migration to zwaveJS2mqtt 4 weeks ago. So your assumption might be correct.
FWIW, I've also experienced this with a Debian 10 x64 (supervisor supported) installation. Have never used Z-Wave / Zigbee and broker delete+reinstall did not resolve. Also tried deleting and reinstalling the MQTT integration.
With reinstalls, HA either could not connect to the broker at all (per HA log) or repeatedly connected+disconnected due to "socket error" (per broker log). Reversion to a 5.1 snapshot always restored normal functionality.
I'm abandoning the addon broker and followed this guide to install MQTT directly on Debian. It was painless and quick with no functionality loss; only needed to rename some entities on my dashboard for cosmetic purposes. I suppose this setup is no longer supervisor compliant but the new broker seems snappier than ever.
Edit: neglected to mention--to get the new broker working, also had to "publish cmnd/tasmotas/so19 1" from one of my MQTT (Tasmota) device's web consoles to re-initiate HA discovery (might've happened on its own if I'd waited, not sure)
@chumbazoid I believe you are the only person on debian+supervised to report being affected by this. (Although most people unhelpfully don't say)
@nickrout he is not the only one!
What?
Probably the "only" is missing, as @chumbazoid is certainly not the only one: I'm also using HA Supervised on Debian 10, but have no time to experiment with removing-reinstalling MQTT (and also use a few topics to store information with retained messages), I'm fine with 5.1 for now.
My system is also a supervised deployment.
Having this issue as well. High CPU usage
and the logs show this:
[08:08:44] INFO: [INFO] found tasmota3 on local database
/bin/auth_srv.sh: line 17: echo: write error: Broken pipe
1616479725: Socket error on client
Re-install didn't help. I have 18 local users in the config file:
logins: ... [18 local users]
anonymous: true
customize:
  active: true
  folder: mosquitto
certfile: fullchain.pem
keyfile: privkey.pem
require_certificate: false
The access files are set as follows:
mosquitto $ cat acl.conf
acl_file /share/mosquitto/accesscontrollist
and all users are set in accesscontrollist:
mosquitto $ cat accesscontrollist
user zigbee2mqtt
topic readwrite #
...
Not sure if these are connected, but I removed the Zigbee2mqttAssistant add-on and rebooted my Pi. Now the situation seems to be normal and the CPU load is decent.
That's just the thing: Home Assistant OS users can't delete a mosquitto.db, because it isn't even there. That is to say, not in the Samba-reachable folders.
Uninstalling the addon and reinstalling would remove the mosquitto.db
It should NOT be necessary to remove mosquitto.db. This is a bug. It needs to be fixed. Why is there no dev input on this?
uninstalling and reinstalling does not fix the problem. High cpu and devices not functioning.
If you want to remove mosquitto.db (at your own risk) then there are two ways to do it that I can see.
Log into the base operating system and remove the file
rm /usr/share/hassio/addons/data/core_mosquitto/mosquitto.db
Or
Enter the addon container and remove the file
docker exec -it addon_core_mosquitto /bin/bash
rm /data/mosquitto.db
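A more cautious variant of the same idea is to set the file aside instead of deleting it, so it can be restored if something goes wrong. This is only a sketch: `set_aside_db` is a hypothetical helper, and it assumes the add-on has been stopped first, since a running broker may write its in-memory state back to the file.

```python
from pathlib import Path
from typing import Optional


def set_aside_db(db_path: str, suffix: str = ".bak") -> Optional[Path]:
    """Rename the persistence file instead of deleting it.

    Returns the backup path, or None if the file does not exist.
    """
    db = Path(db_path)
    if not db.is_file():
        return None
    # Report the size first; a very large file hints at queued messages.
    size_kib = db.stat().st_size / 1024
    print(f"{db} is {size_kib:.1f} KiB")
    backup = db.with_name(db.name + suffix)
    db.replace(backup)  # overwrites an older backup of the same name
    return backup
```

On a supervised install this would be called with the host path quoted above, for example `set_aside_db("/usr/share/hassio/addons/data/core_mosquitto/mosquitto.db")`, run with sufficient privileges.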
I think I have a path to overcome the problem. The comment from realthk got me thinking and I used the following procedure to get 5.1.1 working for me.
1. Take a snapshot (complete) from your current home assistant instance
In order to do this I re-installed it and re-started it, and then 5.1.1 worked fine. I don't know what's different from the situation before, but it works now...
Yes, I had the exact same experience. Also scratching my head why it works now.
Thanks for the tips. I also reinstalled the add-on, but it did not help with the issue on my installation. I also deleted the database manually; still no luck!
I can reproduce the problem very easily by restarting the add-on while connected with MQTT-Explorer and trying to send messages as soon as it reconnects. It takes quite some time until the messages are accepted.
Can anyone with the problem verify this?
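For anyone who wants a number rather than an impression, the delay can be measured without a GUI client. The sketch below (standard library only) times how long the broker takes to answer an MQTT 3.1.1 CONNECT with a CONNACK. The function names, the client ID "probe" and the anonymous login are assumptions for illustration; a broker that requires credentials should still answer, just with a non-zero return code.

```python
import socket
import time

# Minimal MQTT 3.1.1 CONNECT: client ID "probe", clean session, keepalive 60 s,
# no credentials (assumption: adapt the packet if a login is required).
PROBE_CONNECT = b"\x10\x11\x00\x04MQTT\x04\x02\x00\x3c\x00\x05probe"


def parse_connack(reply: bytes) -> int:
    """Return the CONNACK return code (0 = accepted, 5 = not authorized)."""
    if len(reply) != 4 or reply[0] != 0x20 or reply[1] != 0x02:
        raise ValueError(f"not a CONNACK packet: {reply!r}")
    return reply[3]


def time_to_connack(host, port=1883, timeout=60.0, packet=PROBE_CONNECT):
    """Return (seconds until the CONNACK arrived, CONNACK return code)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(packet)
        reply = b""
        while len(reply) < 4:  # a 3.1.1 CONNACK is always 4 bytes
            chunk = sock.recv(4 - len(reply))
            if not chunk:
                raise ConnectionError("broker closed the connection before CONNACK")
            reply += chunk
    return time.monotonic() - start, parse_connack(reply)
```

Calling `time_to_connack("homeassistant.local")` (hostname is an assumption) right after restarting the add-on, once on 5.1 and once on 5.1.1, would give comparable figures to post here.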
Do you mean v5.1 doesn't work for you, or are you failing with v5.1.1?
I tested with version 5.1.1. Did I get this wrong?
Were these tips all for version 5.1? I missed that in the thread.
Testing the add-on with version 5.1 (I restored just this add-on from an old backup), and it works again.
Thanks guys for the temporary solution!
Sigh.... Rolled back to 5.1, and still can't get my Sonoff Zigbee bridge to work :(
Esphome things are offline while deconz things do work properly. Curious...
For reference ; https://community.home-assistant.io/t/mosquitto-5-1-1add-on-is-broken/286979/11
It seems 5.1.1 changed a LOT of stuff instead of just the mosquitto broker
For those who want to rollback 5.1:
- Backup and uninstall mosquitto 5.1.1
- Fork mosquitto repository, edit "version": "5.1.1" to "version": "5.1." in config.json
- Add this custom repository in the Supervisor's add-on store and install
I installed mosquitto 5.1 fresh with this method; hope this helps.
Hi.
I tried this, but I can't add the custom repo to the Supervisor:
21-04-03 07:36:20 ERROR (MainThread) [supervisor.store.git] Can't clone https://github.com/NetJaro/addons/tree/master/mosquitto/ repository: Cmd('git') failed due to: exit code(128)
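One possible cause of this error (an assumption; nobody in the thread confirms it) is that the URL in the log points at a folder view inside the repository (`/tree/master/mosquitto/`), while git can only clone the repository root. A small sketch of reducing such a browser URL to the clonable root, using only the standard library; `repo_root_url` is a hypothetical helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def repo_root_url(url: str) -> str:
    """Reduce a GitHub browser URL to the repository root (/<owner>/<repo>)."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError("expected at least /<owner>/<repo> in the URL path")
    owner, repo = segments[:2]
    # Drop /tree/<branch>/<folder>, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, f"/{owner}/{repo}", "", ""))
```

For the URL in the log above this yields `https://github.com/NetJaro/addons`, which is the form worth trying in the add-on store.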
I can confirm that, following cogneato's suggestion on Discord, deleting the 5.1 add-on (after copying the config) and re-installing the add-on (now 5.1.1) with the copied config gets everything running smoothly. No errors in the log, and all topics are live.
So don't update, but re-install, which gets you the new version (and rewrites the mosquitto.db, which seems to be the issue). This is essentially what @christoph-luebbe said here: https://github.com/home-assistant/addons/issues/1887#issuecomment-802234693
I can also confirm that removing and re-adding the add-on works well. Note down the configuration first. Remove the add-on, then wait a minute so the local storage is removed.
I also restarted Home Assistant. Then install the add-on and check that the username and password are in the configuration of both Mosquitto and the MQTT integration (see the MQTT configuration on the integrations page, or configuration.yaml if configured via YAML).
Hello,
@Mariusthvdb @bsmeding Unfortunately that is not my experience. I removed my add-on and installed it from scratch. The MQTT disconnections I was suffering (with my Shelly devices) did go away compared with upgrading normally from 5.1 to 5.1.1. However, with add-on version 5.1.1, MQTT seems to be much slower: connecting to the broker with MQTTExplorer takes significantly more time with 5.1.1 than with 5.1. Also, the core logs get flooded with "No ACK from MQTT server in 10 seconds" errors, as in https://github.com/bieniu/ha-shellies-discovery/issues/116. I have not seen any other log differences (neither in core nor in the add-on) between 5.1 and 5.1.1 that might point to an explanation, but if any more logs are needed to help track this problem I am happy to help.
I am using Home Assistant OS on a raspberry pi 4. This is my config info:
System Health
Item | Value |
---|---|
version | core-2021.4.3 |
installation_type | Home Assistant OS |
dev | false |
hassio | true |
docker | true |
virtualenv | false |
python_version | 3.8.7 |
os_name | Linux |
os_version | 5.4.83-v8 |
arch | aarch64 |
timezone | Europe/Madrid |
Home Assistant Supervisor
Item | Value |
---|---|
host_os | Home Assistant OS 5.13 |
update_channel | stable |
supervisor_version | supervisor-2021.03.9 |
docker_version | 19.03.15 |
disk_total | 457.7 GB |
disk_used | 23.7 GB |
healthy | true |
supported | true |
board | rpi4-64 |
supervisor_api | ok |
version_api | ok |
installed_addons | Samba share (9.3.1), chrony (2.0.2), Home Assistant Google Drive Backup (0.103.1), Grafana (6.3.0), AdGuard Home (4.0.0), InfluxDB (4.0.4), ESPHome (1.16.2), TasmoAdmin (0.14.1), Zigbee2mqtt (1.18.1-1), motionEye (0.11.1), WireGuard (0.5.1), AppDaemon 4 (0.5.2), Visual Studio Code (3.3.0), Terminal & SSH (9.1.0), File editor (5.2.0), Check Home Assistant configuration (3.6.0), JupyterLab (0.5.0), Z-Wave JS (0.1.17), Mosquitto broker (5.1), Network UPS Tools (0.6.2) |
Does the log from the MQTT add-on give more information? Also, are all devices in the MQTT topics slower, or only specific ones?
I have almost all devices connected via MQTT (zigbee, zwave, milight). After your message I tried several in the homeassistant topic and cannot find any slower-responding devices.
Did you reconfigure the MQTT server in the MQTT integration?
Does the log from the MQTT add-on give more information? Also, are all devices in the MQTT topics slower, or only specific ones? I have almost all devices connected via MQTT (zigbee, zwave, milight). After your message I tried several in the homeassistant topic and cannot find any slower-responding devices.
Not that I can see, both logs seem normal indicating connections from clients normally. I use MQTT with both shelly, zigbee and others (system_sensors) with more than 50 devices. It seems to be more related to shelly devices than zigbee or others (but haven't checked 100% so it is more a feeling).
Did you reconfigure the MQTT server in the MQTT integration?
Yes, actually in one of my 5.1 <-> 5.1.1 installations (do not remember when) the default config stopped working so I had to create a new user in the "logins" section and re-configure the MQTT integration with that manual user.
Thanks for looking into this!!
I'm having similar issues. But I cannot compare to an older installation because I just started using Home Assistant this weekend. I'm trying to connect arduino MQTT devices with the MQTT broker and initially they connect successfully with messages being sent in between.
However, after a few minutes it disconnects, and the Arduino cannot connect again until I reset it. When trying to reconnect, the Arduino just waits for a response; in the meantime the MQTT broker logs say that the device connected successfully. After some time, depending on my timeout setting, the broker states that the device exceeded the timeout. At that point the Arduino also says that the server exceeded the timeout, and it just loops this way. So it seems to me that the Arduino expects some sort of response which doesn't come from the broker.
This is not an issue with the network because it works perfectly on first attempt after each reset and never succeeds when attempting to reconnect.
My Arduino also did not disconnect when I tried a public broker instead of the add-on.
I've tried everything I could find online, and right now this seems to be the only open end left, which I cannot test since I can't downgrade the add-on.
EDIT: I've tried adding my own forked repository to my home assistant but it just doesn't add...
Hello,
@Mariusthvdb @bsmeding However, with add-on version 5.1.1, MQTT seems to be much more slow, connecting to the broker with MQTTExplorer takes significantly more time with 5.1.1 than 5.1. Also core logs get flooded with "No ACK from MQTT server in 10 seconds" errors as in here.
I don't see these errors in the logs anymore, but I can confirm the very long connection time with MQTT Explorer. This was almost immediate before, and now takes 45 seconds. I have only 3 clients on the broker: one is my Owntracks, which hasn't even been active yet; another is an HA integration (on the same HA instance as the add-on broker) with some Bluetooth trackers; and the third is a dedicated Z-Wave hub.
I'll try to downgrade once more to see if this helps.
update
yep, snappy as before, took less than 5 seconds. so definitely something going on.
Thanks to those of you who have suggested methods for getting things working on 5.1.1. I've tried three times and so far haven't had any luck. I think I've tracked the problem down to the HA MQTT Integration not seeing the new broker but am stuck beyond that. Hoping someone might be able to give me a nudge.
Here are the steps I've taken so far:
- Supervisor | Dashboard | Mosquitto broker | Configuration | (save info there to .txt file)
- Supervisor | Dashboard | Mosquitto broker | Info | Uninstall
- (reboot system)
- Supervisor | Add-on Store | Search for mosquitto | reinstall
- (reboot system)
I seem to see favorable info in Supervisor | Dashboard | Mosquitto broker | Log, but after waiting at least 30 minutes, I've got MANY entities that are unavailable. I can use MQTTExplorer to log on to the broker and see that my (cached?) entities are there, but they just show as "unavailable" in HA.
Under 5.1 if I go to Configuration | Integrations | MQTT | Configure | Re-configure MQTT, leave the defaults, and click Submit, I'm able to set various options, click Submit, and see the "Success!" message.
But after upgrading to 5.1.1 if I go to Configuration | Integrations | MQTT | Configure | Re-configure MQTT, leave the defaults (which should be, I'm assuming, the same settings that work with 5.1), I see "Failed to connect."
So I think I'm seeing that the problem lies not with Mosquitto broker version 5.1.1 but perhaps with the Integration located in Configuration.
Ideas? Suggestions? Thanks.
@grantalewis here's a few things that come to mind
You may have done this but on the 2 occasions that I've had a total breakdown of the Zigbee2mqtt (devices present but no MQTT messages reaching the HAS) I've found that re-flashing the coordinator works to 'reboot' the system. Maybe this is the 'nudge' your problem needs to be solved?
https://github.com/Koenkk/Z-Stack-firmware/tree/master/coordinator/Z-Stack_Home_1.2/bin/default
Thinking about pushing other stuff at the Mosquitto broker and HAS:
Can you run any other Service2mqtt, such as Z-Wave 2 MQTT? I'm guessing not, otherwise you'd have said. So moving on...
Perhaps try installing an MQTT service such as
https://gitlab.com/iotlink/iotlink
on a Windows PC and testing whether you can get the MQTT messages from that to the HAS successfully. That will help figure out if it's just Zigbee2mqtt or all MQTT messages.
But after upgrading to 5.1.1 if I go to Configuration | Integrations | MQTT | Configure | Re-configure MQTT, leave the defaults (which should be, I'm assuming, the same settings that work with 5.1), I see "Failed to connect."
@grantalewis, do you have a user defined in the "logins" section of the add-on configuration, and are you using that user in the HA MQTT integration? Something similar happened to me when testing the 5.1 <-> 5.1.1 installations, and at some point I had to set it up manually (I guess it was done automatically when I installed HA for the first time).
@grantalewis try changing the username and password in both the add-on and the integration page.
I found in the code that there was a default homeassistant user and password in the MQTT add-on, and I think that has been removed or is broken. My setup prior to 5.1.1 had a homeassistant user in the integration, but that username and password were not visible in the add-on configuration.
Resetting on both sides may solve the connection issue.
Edit: I see now a similar answer from @jrhbcn
After reverting to 5.1 to resolve issues, decided to give 5.1.1 another try based on some of the suggestions here, and so far so good. Here's what I did:
- Backup backup backup (lol)
- Stopped zigbee2mqtt plugin and set it not to start on boot
- Uninstalled mosquitto plugin
- Deleted HASS local user which I was using for MQTT (mosquitto) auth
- Rebooted the entire HASSIO box
- Installed mosquitto plugin (5.1.1), adjusted configuration to use a login specified there instead of a HASS user.
- Started mosquitto plugin
- Started monitoring "#" on "listen to topic" so I could observe all messages coming in
- Restarted all Tasmota devices, waited for each of them to check in and finish re-adding retain messages
- Started Zigbee2mqtt plugin, set to start on boot, observed messages start coming in, waited until all retain messages were re-added (hundreds in my case, took few minutes)
- Once the flood of initial messages stopped, I tested my devices and observed none of the slowdown or latency that happened when I did the first in-place 5.1 --> 5.1.1 update.
- Only seeing a socket error from one device which I will need to investigate separately, but no other errors so far.
@srnoth have you tried connecting to the MQTT server using MQTT-Explorer? For me v5.1 works OK, but v5.1.1 takes ages to connect and start showing messages (6-7 seconds vs 45 seconds). Furthermore, all my zigbee2mqtt devices also seem to work OK with v5.1.1, but not my Shelly devices (through the MQTT Shelly discovery script), which report ACK timeouts in the logs.
Just tested, and it connects and starts displaying messages within a couple of seconds. It takes a bit longer than that to load everything.