Closed gjobin closed 4 months ago
Can confirm. Results in random crashes.
I seem to have the same issue on an RPi 4. After installing HAOS 12.0, HA doesn't boot anymore. My Pi 4 boots from an SSD.
EDIT: A hard reset solves the issue.
I tried installing 12.0 from scratch then restore my backup (partial or complete) with the same result after a bit.
Does it mean it only starts to happen after you restore the configuration, but the vanilla OS doesn't show these issues?
Can you share any details of the HW the TrueNAS OS is running on?
The symptoms are similar to out-of-memory issues, do you have any insights about the memory usage of the VM?
I seem to have the same issue on an RPi 4. After installing HAOS 12.0, HA doesn't boot anymore. My Pi 4 boots from an SSD.
EDIT: A hard reset solves the issue.
Do you mean you reinstalled home assistant?
I have not tried to create a fresh config on the fresh installation, but it did boot up and allow me to restore the configurations, yes. I also did not wait longer to validate if it would fail after a while, sitting waiting for initialization.
This is my host machine, currently running all my apps and HAOS 11.5
My current 11.5 VM is configured this way
With the 12.0 OS crashing, I did try to bump both Minimum Memory Size and Memory Size to 6 GiB, without success.
EDIT: This is what the /config/hardware page shows in 11.5:
Thanks, it looks like this wasn't/isn't the problem with my device. For me, all add-ons seem to be broken, and when I tried to access it via a proxy manager, no connection could be established.
@gjobin This really looks like the HA VM is running out of memory. The Memory graph in HA shows the actual memory consumption (without buffers/caches), so if it's hovering around 98%, the VM is running out of memory and probably swapping heavily, which produces the symptoms you describe. Here's the memory usage of my instance, running way more custom integrations and add-ons than yours:
It can't be ruled out that the OS update triggered something to misbehave. For a start, I would restart HA in safe mode to check whether any custom integration is to blame. But most likely the memory consumption was always on the edge, even in 11.5, and with some of the recent changes it just went too high.
I also recommend setting the "Minimum memory size" and actual "Memory size" to the same value for the VM. I expect this to disable memory ballooning, i.e. the hypervisor will allocate the fixed amount of RAM instead of increasing it on demand. This can also rule out some lower-level issues.
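To make the "without buffers/caches" figure concrete, here is a minimal Python sketch that computes a comparable percentage from the text of `/proc/meminfo` inside the VM. It assumes the kernel reports a `MemAvailable` field; whether the HA Memory graph uses exactly this calculation is an assumption, and the sample numbers are illustrative only, not taken from any system in this thread.

```python
def used_percent(meminfo_text):
    """Return memory in use as a percentage, not counting reclaimable caches."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # /proc/meminfo values are in kB
    total = fields["MemTotal"]
    # MemAvailable is the kernel's estimate of memory usable without swapping.
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


# Illustrative numbers only (hypothetical VM), not from this thread.
SAMPLE = """\
MemTotal:        4000000 kB
MemFree:          100000 kB
MemAvailable:     200000 kB
Buffers:           50000 kB
Cached:           300000 kB
"""

print(f"{used_percent(SAMPLE):.1f}% used")
```

On a running system the same function could be fed the contents of `/proc/meminfo` instead of `SAMPLE`. A value that stays close to 100% would match the out-of-memory picture described above.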
I have similar issues: the OS crashes randomly (less than once per day). Here is the call stack of the latest crash:
It seems to be related to USB. I took out my Bluetooth adapter to see if that is the cause.
It makes sense to allocate static memory, so I set a fixed amount of RAM, 8 GiB for now, and here is what it looks like on 11.5. Interestingly, I have added more integrations/plugins on 11.5 than when I started this thread. I am pretty new to HA and was still adding integrations.
How do I start in safe mode after updating, once it is crashing and unresponsive?
I just redid the update to 12 with 8GiB and it seems stable so far.
Here is current usage
Do you think it's okay to bring it back to 4 GiB?
Also, since my initial report of the issue, there have been both a Core and a Supervisor update. I wonder if they might have fixed any potential memory issue.
Edit: changed "Core and a Supervisor issue" to "Core and a Supervisor update"
I just redid the update to 12 with 8GiB and it seems stable so far.
Hmm, that looks good indeed. I wonder if there isn't something wrong with the ballooning driver in the newer kernel :thinking: If you're willing to do some more tests, could you set the "minimum memory size" to 512M again and check if it starts to eat the RAM again?
Do you think it's okay to bring it back to 4 GiB?
My guess is that it should be okay to do so. I'd say that many people run it on systems with that (or even lower) amount of RAM.
Also, since my initial report of the issue, there have been both a Core and a Supervisor issue. I wonder if they might have fixed any potential memory issue.
I am not aware of any recent issues in Core or Supervisor causing memory to leak, so likely not.
Changed it back to 512MiB /4GiB. System is hanging again. Moved to 1GiB/8GiB, this is what I see on the hardware page:
It seems to me that your assumption is right.
And to further prove it, I set it back to 4GiB/4GiB without any issues.
Glad it's working for me now. And it seems that at least my issue is reproducible, which is always good news.
Thank you for your investigation!
I have the same issue with Proxmox since 12.0. I will definitely check my VM memory config when I'm back home (I know there's 4 GB allocated, but I'm not sure about the minimum, and I don't have access to it from the office).
Same issue when running 12.0 on the Virtual Machine Manager of a Synology NAS. There is definitely a memory problem, and several processes are killed by the kernel's Out Of Memory Killer. You can see that in the console messages. They are not always the same processes. Sometimes it is even impossible to get a console connection, and a complete power cycle of the virtual machine is required. After a downgrade to 11.1 or 11.5, everything works fine.
Same problem. My Pi 3+ has been giving me trouble since the 12.0 update: random reboots.
Same here. Running on Synology VMM, and it has been freezing randomly every few hours since the latest update. It seems to run out of memory, because I once got that message in the console.
And then I got problems like:
ha > [ 7413.438309] CIFS: VFS: server 192.168.20.2 does not advertise interfaces
[ 7413.440727] CIFS: VFS: server 192.168.20.2 does not advertise
or
ha > [30798.303328] systemd[1]: systemd-resolved.service: Watchdog timeout (limit 3min)!
[30899.577419] systemd-coredump[39466]: Process 109 (systemd-journal) of user 0 dumped core.
And sometimes there is no message, because the console is frozen.
This also happens with 12.1.
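For anyone collecting similar console output, the following Python sketch shows one way to pick the trouble markers out of such lines and convert the kernel timestamps (seconds since boot) into hours of uptime. The input lines are transcribed from the console output quoted above, with the timestamp brackets normalised, which is an assumption about the original screen text. The `Out of memory` marker is also an assumption about how OOM-killer messages are worded. No such line was quoted in this thread.

```python
import re

# Lines transcribed from the console output quoted above; the closing
# brackets of the timestamps are normalised (an assumption).
CONSOLE = """\
[ 7413.438309] CIFS: VFS: server 192.168.20.2 does not advertise interfaces
[30798.303328] systemd[1]: systemd-resolved.service: Watchdog timeout (limit 3min)!
[30899.577419] systemd-coredump[39466]: Process 109 (systemd-journal) of user 0 dumped core.
"""

# "[ seconds.micros] message", as printed on the kernel console.
LINE = re.compile(r"^\[\s*(?P<uptime>\d+\.\d+)\]\s+(?P<message>.*)$")

# Substrings treated as signs of trouble. The first two occur in the lines
# above; "Out of memory" is an ASSUMED wording for OOM-killer messages.
MARKERS = {
    "Watchdog timeout": "watchdog",
    "dumped core": "coredump",
    "Out of memory": "oom",
}


def classify(text):
    """Return (label, uptime in hours) for every line matching a marker."""
    events = []
    for raw in text.splitlines():
        match = LINE.match(raw)
        if not match:
            continue  # skip lines without a kernel timestamp
        for needle, label in MARKERS.items():
            if needle in match["message"]:
                hours = float(match["uptime"]) / 3600
                events.append((label, round(hours, 2)))
                break
    return events


for label, hours in classify(CONSOLE):
    print(f"{label} after about {hours} h of uptime")
```

The CIFS line is deliberately not flagged here, since it is unclear from the thread whether it is related to the freezes.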
There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
Describe the issue you are experiencing
Followed this tutorial to initially install HAOS on TrueNAS scale as a VM.
Host :
Symptoms :
Add-ons :
Integrations (Other than default) :
What operating system image do you use?
generic-x86-64 (Generic UEFI capable x86-64 systems)
What version of Home Assistant Operating System is installed?
11.5
Did you upgrade the Operating System.
Yes
Steps to reproduce the issue
Anything in the Supervisor logs that might be useful for us?
Anything in the Host logs that might be useful for us?
System information
System Information
Home Assistant Community Store
GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.34.0
Stage | running
Available Repositories | 1407
Downloaded Repositories | 6
HACS Data | ok

Home Assistant Cloud
logged_in | false
-- | --
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor
host_os | Home Assistant OS 11.5
-- | --
update_channel | stable
supervisor_version | supervisor-2024.02.0
agent_version | 1.6.0
docker_version | 24.0.7
disk_total | 48.5 GB
disk_used | 6.0 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Studio Code Server (5.15.0), Advanced SSH & Web Terminal (17.1.1), Cloudflared (5.1.4), Home Assistant Google Drive Backup (0.112.1)

Dashboards
dashboards | 1
-- | --
resources | 0
mode | auto-gen

Recorder
oldest_recorder_run | February 23, 2024 at 1:45 AM
-- | --
current_recorder_run | February 26, 2024 at 6:22 PM
estimated_db_size | 37.74 MiB
database_engine | sqlite
database_version | 3.44.2

Additional information
No response