moralmunky / Home-Assistant-Mail-And-Packages

Home Assistant integration providing day of package counts and USPS informed delivery images.
MIT License

Version 0.3.27 Making ZHA Integration Reload Every 30 Minutes #906

Closed AlexKusnezov closed 1 day ago

AlexKusnezov commented 2 weeks ago

Describe the bug: It might be a real edge case, but this is what happened to me after updating this integration from 0.3.26 to 0.3.27:

Environment (please complete the following information):

System Information

version core-2024.6.0
installation_type Home Assistant Container
dev false
hassio false
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.6.31+rpt-rpi-v8
arch aarch64
timezone Europe/Berlin
config_dir /config

Logs You will find the logs in the HA Community link below

firstof9 commented 2 weeks ago

Nothing in your log indicates an issue with this integration.

clintkev251 commented 2 weeks ago

I'm also seeing some kind of impact from 0.3.27. Over the last couple of days I had noticed small gaps in the metrics scraped by Victoria Metrics, along with very significant delays in things like lighting activation. The Victoria Metrics logs reported timeouts when accessing the metrics endpoint. I could also replicate this, very intermittently, by calling the metrics endpoint manually: the call would take 10+ seconds when it normally completes in <100 ms.

Victoria Metrics scrape timeouts:

Screenshot 2024-06-07 224419
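
For anyone who wants to repeat the manual check described above, here is a minimal sketch of one way to do it. It is not the script used in this report: the function names are made up for illustration, and the URL and token passed to `fetch_metrics` are placeholders you would have to supply for your own setup.

```python
import time
import urllib.request


def time_call(fetch):
    """Return how long one call to fetch() takes, in seconds."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def find_slow_calls(fetch, samples=20, threshold_s=1.0, pause_s=0.0):
    """Call fetch() repeatedly and return the durations above threshold_s."""
    slow = []
    for _ in range(samples):
        elapsed = time_call(fetch)
        if elapsed > threshold_s:
            slow.append(elapsed)
        time.sleep(pause_s)  # spacing between samples
    return slow


def fetch_metrics(url, token, timeout_s=30):
    """Fetch the metrics page once. url and token are placeholders (assumptions)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        response.read()
```

Something like `find_slow_calls(lambda: fetch_metrics(METRICS_URL, TOKEN), samples=100, pause_s=5)` would then list any calls that took longer than a second, where `METRICS_URL` and `TOKEN` stand for your own endpoint and access token. Because the slow calls are intermittent, a long run with a pause between samples is more likely to catch one.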

I enabled debug logging globally in HA and checked a 1-second window around several of the points where Victoria Metrics reported a timeout. In each case, the bulk of the logs were from Mail and Packages updating.

After disabling the integration, the timeouts have stopped for around the last hour. I'll continue to monitor, but it looks to me like something is causing HA to effectively lock up while the Mail and Packages update is running. Memory and CPU usage at the time don't seem outside of normal ranges.
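
As a general illustration of how an update can lock things up without using much CPU or memory (this is not a diagnosis of 0.3.27, whose cause is not identified in this thread): Home Assistant runs on a single asyncio event loop, so a synchronous call made directly on that loop stalls every other task until it returns. The sketch below uses only hypothetical names, with `blocking_update` standing in for any slow synchronous work, and compares running it on the loop with handing it to an executor thread.

```python
import asyncio
import time


def blocking_update(duration_s):
    """Stand-in for synchronous work such as a slow network fetch."""
    time.sleep(duration_s)


async def max_heartbeat_gap(stop, interval_s=0.01):
    """Tick on the event loop and return the longest gap between ticks."""
    worst = 0.0
    last = time.perf_counter()
    while not stop.is_set():
        await asyncio.sleep(interval_s)
        now = time.perf_counter()
        worst = max(worst, now - last)
        last = now
    return worst


async def measure(use_executor, duration_s=0.3):
    """Run the update on or off the loop; return the worst heartbeat gap."""
    stop = asyncio.Event()
    heartbeat = asyncio.create_task(max_heartbeat_gap(stop))
    await asyncio.sleep(0.05)  # let the heartbeat start ticking
    if use_executor:
        loop = asyncio.get_running_loop()
        # Runs in a worker thread, so the loop keeps servicing other tasks.
        await loop.run_in_executor(None, blocking_update, duration_s)
    else:
        # Runs on the loop itself, so every other task waits until it returns.
        blocking_update(duration_s)
    stop.set()
    return await heartbeat
```

Running `asyncio.run(measure(use_executor=False))` should report a gap of roughly the full update duration, while `asyncio.run(measure(use_executor=True))` should stay close to the heartbeat interval. In Home Assistant code the usual way to move such work off the loop is `hass.async_add_executor_job(...)`.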

I'll also provide a sample from one of my debug logs during a period where a scrape timed out. Let me know if there are any other data points I can provide that would be helpful. In my case, I'm running HA Container on Talos 1.7.3. I could observe the same behavior on 2024.5.5 and 2024.6.1.

Explore-logs-2024-06-07 22_49_04.txt


After a night of letting things sit, I'm very confident that this was resolved by disabling the mail and packages integration. I've had no scrape timeouts since:

Screenshot 2024-06-08 090853

redjab commented 2 weeks ago

Commenting to add that I experienced the same thing. Nothing relevant in the logs even with debugging on, but whenever this integration ran, HA would become unresponsive for ~20-30 seconds and my Zigbee and MQTT integrations would disconnect. Downgrading to 0.3.25 resolved that (I had updated from 0.3.25 to 0.3.27, so I reverted to 0.3.25).

Gamerayers commented 1 week ago

Can confirm. I have the exact same issue. I moved my Home Assistant from the Docker setup on my OpenWrt router to my Unraid, then reset it and started over with the N100 OpenWrt Docker setup. I couldn't get Mail and Packages to connect, since it kept showing messages and asking for a string, but whenever it had any setup done, it caused my ZHA to break. I found this issue while looking for a fix for the string selection, and I realize now that I don't want Mail and Packages until this is fixed.

clowncracker commented 5 days ago

I had a somewhat related issue, where I was having timeouts for Spotify, Sony Songpal, & Frigate constantly (hundreds of times a day). Disabling this integration made it so I haven't had any issues in over 12 hours.

I tried reenabling this integration and the errors came back. This is 100% the root cause.

firstof9 commented 3 days ago

Please try 0.3.29b0 and see if this resolves your issue.

clowncracker commented 3 days ago

@firstof9 I've not had any issues since upgrading to version 0.3.29b1.

redjab commented 2 days ago

@firstof9 0.3.29b1 has been working well for me as well!

firstof9 commented 2 days ago

@AlexKusnezov @clintkev251 What are your results?

clintkev251 commented 2 days ago

Updated and re-enabled it this morning, so far so good. I'll continue to monitor, but it looks like this is resolved for me. Thanks for the effort

firstof9 commented 1 day ago

Fixed in 0.3.29