home-assistant / core

🏡 Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.85k stars 30.1k forks

Crashing everyday #43155

Closed RavD666 closed 3 years ago

RavD666 commented 3 years ago

Home Assistant crashing every day

The problem

Hass.io keeps crashing every day; I can see it on my router. I have to disconnect the power to boot it back up. It has only started doing this since the last few updates.

How can I get the logs to find the issue?

Environment

Problem-relevant configuration.yaml

Traceback/Error logs

Additional information

pipetboy commented 3 years ago

I have a very similar experience. I have Hass.io running on a Raspberry Pi 3B+ with an SSD. Starting ~2-3 weeks ago it occasionally crashes. Sometimes it's stable for days, then all of a sudden it freezes. I read somewhere that the cause could be a corrupt database, so I deleted it, only to find it crashing again ~10h later. I kept an eye on the logs all day and did not see anything strange. Processor was stable at ~10%, load around 1. Strange thing: using Fing, I can see that the Pi is still on the network, but I cannot connect to it at all.

How to troubleshoot?

muzzak123 commented 3 years ago

Same here. RPi 4. Operating System: Home Assistant OS 5.9, Home Assistant 2020.12.1. Power supply is a genuine RPi 4 plug pack. Ever since the last upgrade it became unstable and kept crashing. I thought it might be a faulty SD card, so I moved to an SSD, but it didn't fix the issue. When it crashes I can't get to it from the network. It won't even ping. The only way to fix it seems to be to pull the plug and restart, and then it all works for a while.

frenck commented 3 years ago

We need logs; without them, this issue is too generic to do anything with.

Additionally, @RavD666, please fill out the issue template as much as possible. There is now data missing.

muzzak123 commented 3 years ago

Yeah, I know it's pretty vague. I've tried to diagnose it further, but how do I get a copy of the log up to the point it crashed? When I pull the plug and reboot, the HA log seems to be cleared and reset by the new boot.

frenck commented 3 years ago

At this point, I dunno what crashed. The Home Assistant logs, for example, are in the configuration folder. For OS issues, you could check on the debug console whether anything is visible (it may even be displayed on the HDMI port).

muzzak123 commented 3 years ago

In my case, when the system crashes the Pi is unreachable, not even pingable, so I can't view the config folder. When I pull the plug and reboot, there is only the new log, which seems to be just a record of the new boot and not of what happened prior to the reboot.

frenck commented 3 years ago

See my previous comment on HDMI.

muzzak123 commented 3 years ago

Perhaps if HA was made, on boot up, to take a copy of the old log before creating the new one, it could help solve these sorts of issues? Kodi does something similar. The logs are written from Kodi startup to Kodi shutdown (or crash). The next time you start Kodi, the existing kodi.log is renamed kodi.old.log and a new kodi.log is created for the new session. In effect you only ever have two logs available: the current one and the previous one.
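
A minimal sketch of that rotate-on-startup idea, in Python. The function name `rotate_log_on_startup` and the file names are illustrative assumptions; this is not how Home Assistant or Kodi actually implement it:

```python
import os
import tempfile
from pathlib import Path


def rotate_log_on_startup(log_path: Path) -> Path:
    """Preserve the previous session's log, then start an empty one.

    Hypothetical helper: 'kodi.log' becomes 'kodi.old.log', replacing any
    older '.old' copy, so only the current and previous logs ever exist.
    """
    old_path = log_path.with_name(f"{log_path.stem}.old{log_path.suffix}")
    if log_path.exists():
        # os.replace overwrites an existing destination file.
        os.replace(log_path, old_path)
    log_path.touch()  # fresh, empty log for the new session
    return old_path


# Demo in a throwaway directory: simulate a crash followed by a reboot.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "home-assistant.log"
    log.write_text("last lines written before the crash\n")

    old = rotate_log_on_startup(log)  # would run once at boot

    preserved_name = old.name
    preserved_text = old.read_text()
    new_log_text = log.read_text()

print(preserved_name)
```

With something like this in place, the lines written just before a hard power cycle would still be readable in the `.old` file after the reboot.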

frenck commented 3 years ago

Logs are there, and multiple ways to get them have already been given. We need logs for this issue report; without them, there is nothing we can do.

muzzak123 commented 3 years ago

Hi, sorry, but if the whole system has crashed and is unresponsive (i.e. not even pingable), then how do I get to the Home Assistant log from just before the crash? If I reboot to make the system responsive again, then all I can see is the new log. Can you please provide more detail, because I don't understand what you mean above?

frenck commented 3 years ago

Anything visible on the HDMI port when it has crashed?

muzzak123 commented 3 years ago

No - Nothing on either port

frenck commented 3 years ago

? That makes no sense, the HDMI will show something. Always.

muzzak123 commented 3 years ago

When I reboot, the HDMI shows the startup process and then stops with the screen showing "Welcome to Home Assistant hassio login:", and that's how it stays all the time it is running until it crashes.

andretoma commented 3 years ago

I’ve the same problem

andretoma commented 3 years ago

I have the same problem with the latest HA version on a Raspberry Pi 4, 32-bit. On a 64-bit installation the problem is more frequent.

muzzak123 commented 3 years ago

Yeah, I don't think anyone is going to look at this unless we have a log, and we can't get a log because it is overwritten when we restart. I voted for an option here https://community.home-assistant.io/t/better-logging/185694 in which a few ideas are suggested to fix this problem. You might want to consider adding your voice.

As an interim fix/solution I flashed a power plug with Tasmota, built with "#define USE_PING", and plugged the Pi power supply into this plug. I then followed the instructions here https://tasmota.github.io/docs/Rules/#watchdog-for-wi-fi-router to create a watchdog rule that pings the Pi every 3 minutes. If it is unreachable, the power plug turns itself off for 10 seconds and then on again. This crudely but effectively reboots the Pi when it crashes.
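
The logic of that watchdog can be sketched in Python. This is not Tasmota rule syntax (the real rule is in the linked Tasmota docs); `watchdog_step`, `is_reachable`, and `set_power` are hypothetical names used only to show the control flow:

```python
import time
from typing import Callable, List


def watchdog_step(
    is_reachable: Callable[[], bool],
    set_power: Callable[[bool], None],
    off_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run one watchdog check; power-cycle the plug if the host is unreachable.

    Returns True if a power cycle was performed, False if the host answered.
    """
    if is_reachable():
        return False
    set_power(False)      # cut power to the Pi
    sleep(off_seconds)    # stay off for 10 seconds by default
    set_power(True)       # restore power so the Pi boots again
    return True


# Demo with fakes instead of a real ping and a real relay.
events: List[str] = []
cycled = watchdog_step(
    is_reachable=lambda: False,  # pretend the ping failed
    set_power=lambda on: events.append("on" if on else "off"),
    sleep=lambda seconds: events.append(f"wait {seconds:g}s"),
)
healthy = watchdog_step(
    is_reachable=lambda: True,
    set_power=lambda on: events.append("unexpected"),
)
print(cycled, healthy, events)
```

In the setup described above, one such check would run every 3 minutes.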

hdehaseleer commented 3 years ago

I have the same problem. Home Assistant crashes at least once a week, until now only during the night. A ping to my Pi still gets a response, but no automations run anymore, and the UI on my PC no longer connects to the core on the Pi.

I have to pull out the power plug (hard reboot), and then the log file seems to be empty. With Samba, I can't find a log file from before my hard reboot, so I have no idea what the origin of the problem is. Is there no way to get information from an old log file (from before the hard reset)?

I'll connect my Pi to an old timer that powers off my Pi every night.

I have a Raspberry Pi model 4. Latest versions: 5.10 and 2020.12.7. Utilization of RAM and SSD is less than 10%, CPU even less than 1%.

bobslaede commented 3 years ago

I have the same problem. Totally up to date. Pi 4. SSD with /mnt/data. Did it before moving data to SSD as well. Mine crashes every other day almost. Sometimes it will run for 4-5 days.

andretoma commented 3 years ago

I have changed the SD card but I have the same problem...

bobslaede commented 3 years ago

Mine crashed during the night. There are no real logs of it. I can see it dropping off of my LAN at around 3 in the morning. It is hardwired with a static lease.
Nothing too special was happening.

Here are my Supervisor logs, which for some reason are an hour behind.


```
21-01-19 01:56:11 INFO (MainThread) [supervisor.auth] Auth request from 'core_mosquitto' for 'Iot'
21-01-19 01:56:11 INFO (MainThread) [supervisor.homeassistant.api] Updated Home Assistant API token
21-01-19 01:56:11 INFO (MainThread) [supervisor.auth] Successful login for 'Iot'
21-01-19 05:54:27 INFO (SyncWorker_7) [supervisor.docker.addon] Starting Docker add-on homeassistant/aarch64-addon-mosquitto with version 5.1
```

At 3:05 it loses connection, which would be 2:05 in the Supervisor log.
I had to yank the power this morning.
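
To line up the router's timestamps with the Supervisor log, the one-hour offset can be applied in code. A small Python sketch, assuming the `yy-mm-dd HH:MM:SS` prefix seen in the excerpt and a fixed one-hour lag; the helper names are made up for illustration:

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

SUPERVISOR_FORMAT = "%y-%m-%d %H:%M:%S"  # e.g. "21-01-19 01:56:11"
LOG_LAG = timedelta(hours=1)  # assumption: the log runs one hour behind local time


def parse_timestamp(line: str) -> datetime:
    """Parse the 17-character timestamp prefix of a Supervisor log line."""
    return datetime.strptime(line[:17], SUPERVISOR_FORMAT)


def entries_around(
    lines: Iterable[str], local_event: datetime
) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return the last log time at or before the event and the first after it.

    The event is given in local time and shifted into log time first.
    """
    event_in_log_time = local_event - LOG_LAG
    times = sorted(parse_timestamp(line) for line in lines)
    before = [t for t in times if t <= event_in_log_time]
    after = [t for t in times if t > event_in_log_time]
    return (before[-1] if before else None, after[0] if after else None)


lines = [
    "21-01-19 01:56:11 INFO (MainThread) [supervisor.auth] Auth request from 'core_mosquitto' for 'Iot'",
    "21-01-19 01:56:11 INFO (MainThread) [supervisor.auth] Successful login for 'Iot'",
    "21-01-19 05:54:27 INFO (SyncWorker_7) [supervisor.docker.addon] Starting Docker add-on homeassistant/aarch64-addon-mosquitto with version 5.1",
]

# Connection lost at 03:05 local time, i.e. 02:05 in the log.
last_before, first_after = entries_around(lines, datetime(2021, 1, 19, 3, 5))
print(last_before, first_after)
```

For this excerpt, the nearest entries on either side of the drop are the 01:56:11 and 05:54:27 lines, so nothing was logged at the moment the connection was lost.
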
fjfricke commented 3 years ago

Unfortunately I have the same problem. But it is simply not possible to pinpoint the reason for the crash, which makes Home Assistant unreliable. Somehow keeping old log files for at least a day would help a lot.

fjfricke commented 3 years ago

There is this feature request which would make it easier to find the problem: https://community.home-assistant.io/t/access-log-file-version-before-home-assistant-restart-in-hass-io/103294

wishmaster1984 commented 3 years ago

Same problem here. Pi3b+ 64 bit HAS latest version

Tovrin commented 3 years ago

Same problem here. I'm not really up on Linux so I'm not sure how to access the logs. I can get them if someone can tell me how. Or if there's a way to put the logs in the HA folder structure, I can Samba into it.

hdehaseleer commented 3 years ago

Six days later and I had exactly the same problem last night. No automations ran. I could still log on and open the overview menu. Opening the other menus gave this error:

[image: crash2]

Again, the last temperature measurement of my devices was at 02h07.


This morning I had a strange "half crash"

Strange behavior: no automations anymore + I could log on and set lights manually + no access to the other menus + no temperature history anymore from 02:00 for KNX AND Z-Wave

antoinetielbeke commented 3 years ago

> ? That makes no sense, the HDMI will show something. Always.

Mine has been crashing every day for a couple of months (even surviving all the updates). The only option is to unplug the power cable (official Raspberry Pi cable + Samsung SD card). When it crashes and I plug in the HDMI it shows nothing, but when I reboot it, it shows video output.

bobslaede commented 3 years ago

Even though my power supply was rated for 15W I tried getting a new one, and for a week now, it hasn't crashed.

muzzak123 commented 3 years ago

In my case, I seriously doubt this is a power issue. I have tried 2 separate power supplies, an official RPI4 supply and 50W (5V 10Amp) supply. The issue still happens regardless. I don't think it is hardware related as I have never changed my hardware except to try to resolve this issue. I didn't have this issue for the majority of 2020. It only started late 2020, for versions around the time of the new versioning schema. So far it continues for every upgrade including 2021.1.5

JonasSL commented 3 years ago

Same problem here. Running on a Pi4 2GB.

djdemer commented 3 years ago

Same problem here... I'm brand new and have just finished building up my new HAS platform: Home Assistant OS 5.10 running on a new 3B+. It's been two weeks of integration, development and learning. I am finally satisfied with my Dashboard and am checking it several times a day. My automations have been dropping or panels have become unresponsive. The only fix has been to power off and on, which brought me over to this forum looking for what others do to find dumps/logs. I have 'Share Diagnostics' in the Supervisor tab enabled; I'm guessing this feature may not get used much, don't know? Well anyway, I see others share my pain. I'll stop looking for a solution now and patiently wait for another release update.

Tovrin commented 3 years ago

This is now getting to the point where I think I may have to automate some form of reboot of my Home Assistant. I'm thinking of running a separate VM with a bare-bones HASSIO, Node-RED, ping tools and a Wi-Fi switch to reboot my main HASSIO host when it becomes non-responsive. It shouldn't have come to this, but it has.

I'd love it if someone could tell me how to get the crash logs. I'm not used to Linux, so I have no idea. Maybe the crash logs would help .... I just need to know how to get at them.

Salamandar commented 3 years ago

Same here, RPI4 with Deconz + Conbee, it crashes every couple of days. A full reinstall and config restore on a new SD card did not help. We initially thought it was due to the bluetooth USB dongle, but without it the issue persists.

What's weird is that on my other install (that uses ZHA+Elelabs Zigbee), I have NEVER had this issue since installing it 3 months ago.

I'm going to try to get the systemd logs before rebooting the Pi, let's hope there's something useful in them.

djdemer commented 3 years ago

I have been running fine lately, why? I'll throw out this idea: my server is enrolled with Nabu Casa for my SmartThings integration. Could interruptions/outages on Nabu Casa possibly affect my HA instance overnight?

Tovrin commented 3 years ago

I just had another crash overnight. The host went down completely. Not able to ping. Nothing. Nada.

So after reboot, there's no access to the HASSIO logs ... but if someone can tell me how to turn on the syslogs, maybe that could help ...... can ANYONE tell me?

Toukite commented 3 years ago

Hello, I would also love to have this feature request implemented https://community.home-assistant.io/t/better-logging/185694 !!!! I tried to do something similar with automation and events, but there are too many side things to be taken care of, I'm not good enough. I think many people use HassOS on RPI and would love to see logs rotation/retention coming to help investigate crashes

juite commented 3 years ago

Hi everybody,

Also the same thing. After the kind of mandatory security update, HA is constantly crashing. I upgraded yesterday to 2021.2 in the hope things would be better, but a few minutes ago it crashed again. @frenck I would like to deliver you some kind of logging. Can you tell me where I can find the logging you want? Happy to send you the files. You are also welcome here in Almelo to take a look :)

Br, Jeroen

Moved to Hyper-V from Raspberry Pi 4. No issues anymore...

blerrgh commented 3 years ago

Mine also just started doing this after upgrading to 2021.2. I also installed Z-Wave JS but haven't configured it yet, so I don't think it's that. Otherwise, I didn't make any changes.

Supervised Home Assistant on raspberry pi3 goes unresponsive at least once a day and requires a hard reboot.

andretoma commented 3 years ago

I solved it by removing the case. My Pi 4 was freezing because of the high temperature. The more or less demanding activity that some plugins require made us suspect the plugins themselves. That was not the case; it was enough to cool my Pi 4.

Andrea

Tovrin commented 3 years ago

I solved this by installing on an old NUC that I had lying about and wasn't using.

hdehaseleer commented 3 years ago

@andretoma: My RPi model 4 has a closed case but also has a fan. As you can see in the screenshot below, the temperature does not go higher than 35°C, which is very low for a CPU, and the CPU usage does not go higher than 5%.

Indeed, we need better logging. When I have to do a power off/on, the log files are erased, so there is no way of knowing what happened.

[image: screenshot]

eizemazal commented 3 years ago

Hi, I am new to Home Assistant and just installed it on my brand new Pi 4B. I have a very similar problem to those above. The system is unstable: every few minutes or hours I get messages like

Unable to load the panel source: /api/hassio/app/entrypoint.js.

for some (not all) of the views in admin interface. I can log out and in and browse part of the interface. Then I need to power cycle the Pi to get rid of the error.

Once this happened while I was connected over SSH, and it seems that root filesystem became unavailable, but the shell did not freeze and I could invoke top to see running processes.

Before this, I experimented with HA installed on my Mac in a Python venv, and it worked without any issues.

Any suggestions?

andretoma commented 3 years ago

Temperature or motioneye plugin

Andrea

eizemazal commented 3 years ago

> Temperature or motioneye plugin

I have neither this plugin nor any video processing in my system. Temperature was my first guess: at first, I assembled the Pi without a fan because I did not expect high load. But adding the fan did not bring any improvement. Load average is around 0.01. My system is idling, and a quad-core ARM should not be heavily loaded by Linux with a few daemons...

short4bmoney commented 3 years ago

Similar issues for me, until recently I thought I was over the hill. Not so... but somehow I was able to access the Samba Share drive even though Hass.io was completely unresponsive. Here's the last few lines of my logs, where the last ~10 or so show the fails to restart and load logs with timeout errors:

2021-02-08 16:37:36 WARNING (MainThread) [homeassistant.components.media_player] Updating samsungtv media_player took longer than the scheduled update interval 0:00:10
2021-02-08 16:37:36 WARNING (MainThread) [homeassistant.helpers.entity] Update of media_player.samsung_un55ku630d is taking over 10 seconds
2021-02-08 16:47:31 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 16:51:51 WARNING (Thread-8) [pychromecast.socket_client] [Living room speaker(192.168.0.247):8009] Error communicating with socket, resetting connection
2021-02-08 17:07:17 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 17:27:13 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 17:32:01 ERROR (Thread-8) [homeassistant.components.cast.media_player] Failed to cast media https://cabrillobay.duckdns.org/api/tts_proxy/65289a60d7c8fc5245f07de32e5bfcc5212fb124_en_-_google_translate.mp3 from external_url (https://cabrillobay.duckdns.org). Please make sure the URL is: Reachable from the cast device and either a publicly resolvable hostname or an IP address
2021-02-08 17:32:39 ERROR (Thread-8) [homeassistant.components.cast.media_player] Failed to cast media https://cabrillobay.duckdns.org/api/tts_proxy/65289a60d7c8fc5245f07de32e5bfcc5212fb124_en_-_google_translate.mp3 from external_url (https://cabrillobay.duckdns.org). Please make sure the URL is: Reachable from the cast device and either a publicly resolvable hostname or an IP address
2021-02-08 17:47:19 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 18:07:34 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 18:28:00 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 18:48:36 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 19:09:22 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 19:30:18 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 19:51:23 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 20:12:39 ERROR (stream_worker) [homeassistant.components.stream.worker] Error opening stream rtsp://hassio:tinker@192.168.0.49/live
2021-02-08 20:18:39 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/stats
2021-02-08 20:18:40 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/logs
2021-02-08 20:18:40 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request core/stats
2021-02-08 20:26:15 ERROR (MainThread) [homeassistant.components.hassio.handler] Timeout on /homeassistant/restart request
2021-02-08 20:30:54 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request core/stats
2021-02-08 20:30:54 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/logs
2021-02-08 20:30:54 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/stats
2021-02-08 20:31:14 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/logs
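
Long excerpts like this are easier to scan when grouped by logger. A rough Python sketch, assuming the `date time LEVEL (thread) [logger] message` layout visible above; the regex and the function name are illustrative and not part of Home Assistant:

```python
import re
from collections import Counter
from typing import Iterable

# Assumed layout: "2021-02-08 20:18:39 ERROR (MainThread) [logger.name] message"
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"\((?P<thread>[^)]*)\) "
    r"\[(?P<logger>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def count_by_logger(lines: Iterable[str], level: str = "ERROR") -> Counter:
    """Count log lines of the given level per logger; unparseable lines are skipped."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match["level"] == level:
            counts[match["logger"]] += 1
    return counts


sample = [
    "2021-02-08 16:37:36 WARNING (MainThread) [homeassistant.helpers.entity] Update of media_player.samsung_un55ku630d is taking over 10 seconds",
    "2021-02-08 20:18:39 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/stats",
    "2021-02-08 20:18:40 ERROR (MainThread) [homeassistant.components.hassio.http] Client timeout error on API request supervisor/logs",
    "2021-02-08 20:26:15 ERROR (MainThread) [homeassistant.components.hassio.handler] Timeout on /homeassistant/restart request",
]

error_counts = count_by_logger(sample)
for logger, count in error_counts.most_common():
    print(f"{count:3d}  {logger}")
```

Run over the full file, a tally like this makes it quicker to see which component was failing most often before the system stopped responding.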

EDIT: When I finally got the restart to work, nothing loaded, and my logs only say:

2021-02-08 21:12:16 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for hacs which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant.
2021-02-08 21:12:16 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for ezviz_cloud which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant.
2021-02-08 21:12:42 WARNING (MainThread) [homeassistant.setup] Setup of http is taking over 10 seconds.

frenck commented 3 years ago

So, I see a lot of "I have the same" comments, however, no one has actually supplied any logs that show the issue. It is hard to say one has the "same" problem, if we don't know what the problem is.

So at this point, please be aware, you could have a different problem than the issue author.

As for the logs, please, start sharing information on things you can see on the HDMI port when crashing (or VM console in case you run it that way).

Additionally, the systemd logs are available on the host OS; they are kept over a longer period of time (even after a reboot). See also:

https://developers.home-assistant.io/docs/operating-system/debugging
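
Once a journal export has been copied off the device, it can be filtered for errors offline. The sketch below assumes the previous boot's journal was exported as JSON lines (for example with `journalctl -b -1 -o json`, which only works if the journal is persistent). `PRIORITY`, `MESSAGE`, and `_SYSTEMD_UNIT` are standard journal fields; the function name and the sample entries are hypothetical:

```python
import json
from typing import Iterable, List, Tuple


def errors_from_journal(
    json_lines: Iterable[str], max_priority: int = 3
) -> List[Tuple[str, str]]:
    """Return (unit, message) pairs for entries at priority 'err' (3) or more severe.

    Syslog priorities run from 0 (emerg) to 7 (debug); lower is more severe.
    """
    found: List[Tuple[str, str]] = []
    for raw in json_lines:
        raw = raw.strip()
        if not raw:
            continue
        entry = json.loads(raw)
        # journalctl emits PRIORITY as a string such as "3".
        priority = int(entry.get("PRIORITY", 7))
        if priority <= max_priority:
            unit = entry.get("_SYSTEMD_UNIT", "kernel/unknown")
            found.append((unit, str(entry.get("MESSAGE", ""))))
    return found


# Hypothetical sample entries, not taken from a real crash.
sample = [
    '{"PRIORITY": "6", "_SYSTEMD_UNIT": "docker.service", "MESSAGE": "container started"}',
    '{"PRIORITY": "3", "_SYSTEMD_UNIT": "systemd-journald.service", "MESSAGE": "example error message"}',
    '{"PRIORITY": "2", "MESSAGE": "example critical kernel message"}',
]

errors = errors_from_journal(sample)
for unit, message in errors:
    print(f"{unit}: {message}")
```

Only the entries at error level or worse are kept, which is usually the part worth pasting into an issue report.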

In conclusion, we don't need better logs.. things are logged. We need logs to be provided. That is a different thing.

Salamandar commented 3 years ago

@frenck I just got the logs from a faulty system, and the only errors I see are corrupted sqlite errors.

I removed the sqlite as documented somewhere in the community forum, lost the history as expected. For now it runs, we'll see if it stops crashing every now and then.

Ah, and we changed the SD card; the previous one was probably the reason for the corrupted DB.

Tovrin commented 3 years ago

@frenck I've asked several times HOW to get my logs when the /var/syslog is empty, but that's fallen on deaf ears and no-one ever replied. I gave up in the end and put it on a NUC.

frenck commented 3 years ago

@Salamandar Hm. SQLite issues should not be able to make a system completely unreachable. A dying SD card, however, can 😟

@Tovrin systemd log, as documented in the link above. Additionally, if a system crashes completely (as indicated as a symptom by the original author) something should be visible on the main console (via e.g., HDMI or VM console). This was already requested and hinted at in the earlier posts.

Salamandar commented 3 years ago

@frenck I agree… But that was the only thing visible in systemd logs. Next time it crashes, if it does, I'll ask my brother to pull the /var/log/messages too (if it's in the data partition).