home-assistant / operating-system

:beginner: Home Assistant Operating System
Apache License 2.0

Problem getting SMB share disk usage after upgrading to HA OS 11.3 #3041

Closed ovizii closed 8 months ago

ovizii commented 8 months ago

Describe the issue you are experiencing

I am using:

HA OS 11.3, HA Core 2024.1.0, HA Supervisor 2023.12.0

When Frigate runs shutil.disk_usage(path) it results in this error:

2024-01-05 11:49:24.300409714 BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

11.3

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

  1. Install frigate and configure
  2. All working fine
  3. Upgrade HA OS from 11.2 to 11.3
  4. See above error description

Anything in the Supervisor logs that might be useful for us?

`2024-01-05 11:49:24.300409714  BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'`

Anything in the Host logs that might be useful for us?

Error while setting up systemmonitor platform for sensor
12:48:57 – (ERROR) Sensor

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 360, in _async_setup_platform
    await asyncio.shield(task)
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/sensor.py", line 393, in async_setup_entry
    disk_arguments = await hass.async_add_executor_job(get_all_disk_mounts)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/util.py", line 22, in get_all_disk_mounts
    usage = psutil.disk_usage(part.mountpoint)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psutil/__init__.py", line 2012, in disk_usage
    return _psplatform.disk_usage(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psutil/_psposix.py", line 177, in disk_usage
    st = os.statvfs(path)
         ^^^^^^^^^^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/Music'

System information

System Information

version core-2024.1.0
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.11.6
os_name Linux
os_version 6.1.70-haos
arch x86_64
timezone Europe/Bucharest
config_dir /config
Home Assistant Community Store

GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.33.0
Stage | running
Available Repositories | 1366
Downloaded Repositories | 10

Home Assistant Cloud

logged_in | false
-- | --
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor

host_os | Home Assistant OS 11.3
-- | --
update_channel | stable
supervisor_version | supervisor-2023.12.0
agent_version | 1.6.0
docker_version | 24.0.7
disk_total | 48.5 GB
disk_used | 12.8 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Check Home Assistant configuration (3.11.0), Log Viewer (0.16.0), Tailscale (1.56.1.1), Samba Backup (5.2.0), Advanced SSH & Web Terminal (17.0.1), Squeezelite (0.0.17), Mosquitto broker (6.4.0), Zigbee2MQTT (1.35.0-1), Studio Code Server (5.14.2), Frigate (Full Access) (0.12.1), Music Assistant BETA (2.0.0b80)

Dashboards

dashboards | 1
-- | --
resources | 3
views | 1
mode | storage

Recorder

oldest_recorder_run | 31 December 2023 at 15:10
-- | --
current_recorder_run | 5 January 2024 at 12:48
estimated_db_size | 57.61 MiB
database_engine | sqlite
database_version | 3.41.2

Additional information

No response

sairon commented 8 months ago

Please provide correct logs - i.e. Host and Supervisor ones. They should hopefully contain something useful. Unfortunately, I am not able to reproduce the issue here with CIFS shares so there has to be something specific in your environment.

ovizii commented 8 months ago

I'll give it a try, here are logs from

HOST:

Jan 05 16:50:23 ha-ct systemd[1]: run-docker-runtime\x2drunc-moby-f0c19d2f5ab43f466ccfb03620fe4d78dfdf274853b9a4af9caffb783027717c-runc.eAgyBS.mount: Deactivated successfully.
Jan 05 16:50:26 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:26 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:26 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:30 ha-ct kernel: CIFS: VFS: \\192.168.98.2 Send error in SessSetup = -11
Jan 05 16:50:53 ha-ct systemd[1]: run-docker-runtime\x2drunc-moby-f0c19d2f5ab43f466ccfb03620fe4d78dfdf274853b9a4af9caffb783027717c-runc.jhgq7y.mount: Deactivated successfully.
Jan 05 16:50:58 ha-ct kernel: CIFS: VFS: \\192.168.98.2 Send error in SessSetup = -11
Jan 05 16:51:23 ha-ct systemd[1]: run-docker-runtime\x2drunc-moby-

SUPERVISOR does not seem to contain anything related.

sairon commented 8 months ago

The return code is EAGAIN (i.e. something is saying "try again"), but I can't find where it's coming from. Earlier log lines might contain more hints, as the first line logged when a share is mounted is CIFS: Attempting to mount \\.... Could you (or anybody else) provide the whole dmesg | grep CIFS output?

Also, what is the SMB host? Can you get any logs from that when this happens?

ovizii commented 8 months ago

Here's my try. The output of dmesg | grep CIFS | more

image

The rest of the logs are all the same, with no other entries. Sadly, I have minimized Samba logging, so I would need to raise the log level and restart. Or was it lower the log level? Hm... Let's see if anyone else can contribute first.

My Samba is this docker container running on another host: https://github.com/uPagge/samba

jweston2112 commented 8 months ago

My Attempt and what looks like Pertinent info from the HOST Log

Jan 05 18:46:24 homeassistant dropbear[11374]: [11374] Jan 05 18:46:24 Exit (root) from <192.168.128.60:34304>: Disconnect received
Jan 05 18:46:48 homeassistant systemd[1]: Unmounting Supervisor bind mount: bind_Music...
Jan 05 18:46:48 homeassistant systemd[1]: mnt-data-supervisor-media-Music.mount: Deactivated successfully.
Jan 05 18:46:48 homeassistant systemd[1]: Unmounted Supervisor bind mount: bind_Music.
Jan 05 18:46:49 homeassistant systemd[1]: Unmounting Supervisor cifs mount: Music...
Jan 05 18:46:49 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Deactivated successfully.
Jan 05 18:46:49 homeassistant systemd[1]: Unmounted Supervisor cifs mount: Music.
Jan 05 18:46:50 homeassistant systemd[1]: Mounting Supervisor cifs mount: Music...
Jan 05 18:46:50 homeassistant kernel: CIFS: Attempting to mount \\192.168.128.6\BACKUPS
Jan 05 18:46:50 homeassistant mount[14847]: mount error(13): Permission denied
Jan 05 18:46:50 homeassistant mount[14847]: Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)
Jan 05 18:46:50 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -13
Jan 05 18:46:50 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Mount process exited, code=exited, status=32/n/a
Jan 05 18:46:50 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Failed with result 'exit-code'.
Jan 05 18:46:50 homeassistant systemd[1]: Failed to mount Supervisor cifs mount: Music.
Jan 05 18:47:18 homeassistant systemd[1]: Mounting Supervisor cifs mount: Music...
Jan 05 18:47:18 homeassistant kernel: CIFS: Attempting to mount \\192.168.128.6\BACKUPS
Jan 05 18:47:18 homeassistant mount[15105]: mount error(95): Operation not supported
Jan 05 18:47:18 homeassistant mount[15105]: Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)
Jan 05 18:47:18 homeassistant kernel: CIFS: VFS: \\192.168.128.6 Dialect not supported by server. Consider  specifying vers=1.0 or vers=2.0 on mount for accessing older servers
Jan 05 18:47:18 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -95
Jan 05 18:47:18 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Mount process exited, code=exited, status=32/n/a
Jan 05 18:47:18 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Failed with result 'exit-code'.
Jan 05 18:47:18 homeassistant systemd[1]: Failed to mount Supervisor cifs mount: Music.
Jan 05 18:47:32 homeassistant systemd[1]: Mounting Supervisor cifs mount: Music...
Jan 05 18:47:32 homeassistant kernel: CIFS: Attempting to mount \\192.168.128.6\BACKUPS
Jan 05 18:47:32 homeassistant systemd[1]: Mounted Supervisor cifs mount: Music.
Jan 05 18:47:42 homeassistant systemd[1]: Mounting Supervisor bind mount: bind_Music...
Jan 05 18:47:42 homeassistant systemd[1]: Mounted Supervisor bind mount: bind_Music.
Jan 05 18:48:00 homeassistant systemd[1]: Starting Cleanup of Temporary Directories...
Jan 05 18:48:00 homeassistant systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Jan 05 18:48:00 homeassistant systemd[1]: Finished Cleanup of Temporary Directories.
Jan 05 18:48:00 homeassistant systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Jan 05 18:48:35 homeassistant dropbear[15812]: [15812] Jan 05 18:48:35 Child connection from 192.168.128.60:49770

jweston2112 commented 8 months ago

When I do a df in the terminal window I do get a familiar error message

image

I get a very similar message when I ssh into the underlying OS

image

Also the version of smbd is image

and the docs say it needs SMB protocol version 2 or higher, preferably SMB3

On My PI 4 which is running 11.3 / 2024.1.1 the df looks like

image

and the System Monitor integration works just fine

henriklund commented 8 months ago

Running core-2024.1.0. Upgrading the OS to 11.3 triggered this behaviour (exemplified by SystemMonitor failing in my case). Shares also show up as unavailable when running df; however, they continue to work (tested by copying an existing file to a new name on the HA side and performing a diff on the NAS side).

chma73 commented 8 months ago

hi @henriklund I also have the issue that SystemMonitor has no data after upgrading. @gjohansson-ST thinks that's something different and that another issue should be opened; maybe you can open one.

zanyraspi commented 8 months ago

+1, I am facing the same error.

Screenshot 2024-01-05 233714

It started after updating to HAOS 11.3 and Core 2024.1.1.

DunkyDaMonkey commented 8 months ago

+1 I had the same issue with the 11.1 -> 11.3 HAOS update, along with a few other quirks.

I didn't have a lot of time to debug so I just rolled back to 11.1 (last running operating system as I was away during the 11.2 release).

MaBeniu commented 8 months ago

+1. Getting samba storage issues after upgrade as well

gednet commented 8 months ago

Same here. As mentioned above, rolling back to 11.2 resolves the issue.

sairon commented 8 months ago

Please, when adding comments like "+1" to this issue, also attach logs from HAOS and details about your setup (what type of installation are you running, where is Samba share hosted). So far we haven't been able to reproduce the issue and gathering more info is crucial for identifying the cause.

chma73 commented 8 months ago

hi @sairon just for double confirmation - is the SMB issue linked with the problem that the system monitor is not working and has no data? If yes, here is my log data:

Logger: homeassistant.components.sensor
Source: helpers/entity_platform.py:360
Integration: Sensor (documentation, issues)
First occurred: 5 January 2024 at 19:24:49 (2 occurrences)
Last logged: 11:07:55

Error while setting up systemmonitor platform for sensor
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 360, in _async_setup_platform
    await asyncio.shield(task)
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/sensor.py", line 393, in async_setup_entry
    disk_arguments = await hass.async_add_executor_job(get_all_disk_mounts)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/util.py", line 22, in get_all_disk_mounts
    usage = psutil.disk_usage(part.mountpoint)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psutil/__init__.py", line 2012, in disk_usage
    return _psplatform.disk_usage(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psutil/_psposix.py", line 177, in disk_usage
    st = os.statvfs(path)
         ^^^^^^^^^^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/DSEmma_Media'

sairon commented 8 months ago

@chma73 Yes, it's the same root cause.

@ovizii Tried the same setup - HAOS 11.3 generic-x86-64, upagge/samba container running on another host (latest image with smbd 4.16.8). The shares mount just fine :shrug:

Maybe there are more details that make the difference. Can you please check Samba logs on the remote host, confirm it's the same smbd version, ideally also share config you are using on the server?

sairon commented 8 months ago

@jweston2112 In your case the problem seems to be a bit different. Was it indeed triggered by the update and does downgrade help? In your logs I see these:

Jan 05 18:46:50 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -13
Jan 05 18:47:18 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -95

The first one is the return code for "permission denied", which the CIFS client returns when incorrect credentials are supplied. The second one is "operation not supported", which is returned when you select an incorrect protocol version. Both of these rather suggest some misconfiguration. Are you sure you haven't made any changes to your shares' configuration in the meantime?
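For anyone decoding the kernel return codes quoted in this thread (-11, -13, -95), here is a minimal Python sketch that maps a negative return code to its errno name and message. The helper name describe_kernel_rc is made up for illustration, and the numeric values are platform-specific, so the mapping shown applies to Linux.

```python
import errno
import os


def describe_kernel_rc(rc: int) -> tuple[str, str]:
    """Translate a negative kernel return code into (errno name, message).

    Hypothetical helper for illustration; kernel log lines such as
    "cifs_mount failed w/return code = -13" carry the errno negated.
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # Return codes that appear in the logs of this issue.
    for rc in (-11, -13, -95):
        name, message = describe_kernel_rc(rc)
        print(f"rc = {rc}: {name} ({message})")
```

The errno numbers differ between operating systems, so run this on a Linux machine when interpreting Linux kernel logs.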

jweston2112 commented 8 months ago

@sairon Oh yes, I meant to mention that... I have removed and re-added the shares in the storage app multiple times. I found that when you select CIFS it doesn't give you the choice of protocol until it fails once, unless I am doing something else wrong. So it failed because I didn't initially have the option to choose Auto (2.1+). As I said above, the Docker container I am using specifically says the protocol needs to be higher than 2 and that SMB3 is preferred.

As you can see below the errors, the share mounted just fine in the end. I think what is throwing SystemMonitor for a loop is that the OS reports the Samba share as temporarily unavailable instead of returning the information about the share.

ovizii commented 8 months ago

@ovizii Tried the same setup - HAOS 11.3 generic-x86-64, upagge/samba container running on another host (latest image with smbd 4.16.8). The shares mount just fine 🤷

My shares also mount just fine, please check my problem again:

When Frigate runs shutil.disk_usage(path) it results in this error:

2024-01-05 11:49:24.300409714 BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'

This looks like just the checking of resource usage fails, hence frigate reports the share as unavailable.
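To illustrate the point that only the usage check fails, here is a minimal sketch of a guarded disk-usage call. This is not Frigate's or Home Assistant's actual code; safe_disk_usage is a hypothetical helper, and it only assumes that the failure surfaces as an OSError, which BlockingIOError is a subclass of.

```python
import shutil
from typing import Optional


def safe_disk_usage(path: str) -> Optional[tuple[int, int, int]]:
    """Return (total, used, free) in bytes, or None if the check fails.

    Hypothetical helper for illustration only.
    """
    try:
        usage = shutil.disk_usage(path)
    except OSError as err:
        # BlockingIOError (errno 11, EAGAIN) is a subclass of OSError, so the
        # failure reported in this issue lands here instead of propagating.
        print(f"disk usage unavailable for {path}: {err}")
        return None
    return usage.total, usage.used, usage.free


if __name__ == "__main__":
    print(safe_disk_usage("/media/frigate/recordings"))
```

With a guard like this the caller can treat the usage figures as unknown while still reading and writing files on the share.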

gednet commented 8 months ago

@sairon - can you look on this? https://github.com/blakeblackshear/frigate/issues/2943#issuecomment-1878738114

2024-01-05 11:49:24.300409714 BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'

This looks like just the checking of resource usage fails, hence frigate reports the share as unavailable.

If @ovizii is right, that will be helpful and is probably a lead towards a solution. I can't check this myself because I have a production system.

henriklund commented 8 months ago

hi @henriklund I also have the issue that SystemMonitor has no data after upgrading. @gjohansson-ST thinks that's something different and that another issue should be opened; maybe you can open one.

I do not see the SystemMonitor fault as anything but a symptom. df -k | grep unavailable returns df: /share/NAS_share: Resource temporarily unavailable. However, all operations on that share (copy / create / delete etc.) work as expected. On the NAS side, the CIFS (SMB3) connection is logged as completed without error and all changes are registered.

DunkyDaMonkey commented 8 months ago

Please, when adding comments like "+1" to this issue, also attach logs from HAOS and details about your setup (what type of installation are you running, where is Samba share hosted). So far we haven't been able to reproduce the issue and gathering more info is crucial for identifying the cause.

Sorry I should have known better.

I've rolled forward again and got the same issue here's the details to help with recreation.

Version: note that I rolled forward HAOS 11.1 -> 11.3 (this is the only change I made that resulted in the issue). image

Frigate version 0.12.1
Frigate integration version 4.0.1
Frigate uses an externally mounted NAS for storage (CIFS) (as do backups and the recorder DB) image

Install:

Problem:

This error originated from a custom integration.

Logger: custom_components.frigate.api
Source: custom_components/frigate/api.py:227
Integration: Frigate (documentation, issues)
First occurred: 10:21:38 AM (6 occurrences)
Last logged: 10:25:06 AM

Error fetching information from http://ccab4aaf-frigate:5000/api/stats: Cannot connect to host ccab4aaf-frigate:5000 ssl:default [Name has no usable address]
Error fetching information from http://ccab4aaf-frigate:5000/api/stats: 500, message='INTERNAL SERVER ERROR', url=URL('http://ccab4aaf-frigate:5000/api/stats')

frigate is unable to setup image

Logger: homeassistant.components.sensor
Source: helpers/entity_platform.py:360
Integration: Sensor (documentation, issues)
First occurred: 10:21:32 AM (1 occurrences)
Last logged: 10:21:32 AM

Error while setting up systemmonitor platform for sensor
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 360, in _async_setup_platform
    await asyncio.shield(task)
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/sensor.py", line 393, in async_setup_entry
    disk_arguments = await hass.async_add_executor_job(get_all_disk_mounts)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/util.py", line 22, in get_all_disk_mounts
    usage = psutil.disk_usage(part.mountpoint)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psutil/__init__.py", line 2012, in disk_usage
    return _psplatform.disk_usage(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/psutil/_psposix.py", line 177, in disk_usage
    st = os.statvfs(path)
         ^^^^^^^^^^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/Surveillance'

System monitor shows it as loaded but all values are unavailable image

Rolling back to 11.2 this time

chma73 commented 8 months ago

hi @DunkyDaMonkey, meanwhile I'm also sure that I got the issue with the OS update from 11.2 to 11.3. How did you roll back to 11.2? (Sorry for asking, it's the first time I'm doing an OS rollback.)

DunkyDaMonkey commented 8 months ago

hi @DunkyDaMonkey, meanwhile I'm also sure that I got the issue with the OS update from 11.2 to 11.3. How did you roll back to 11.2? (Sorry for asking, it's the first time I'm doing an OS rollback.)

No issues with rolling back to 11.2 instead of 11.1 (ha os update --version 11.2 is the command if you need it).

I've also updated to core 2024.1.2 with no issues.

chma73 commented 8 months ago

@DunkyDaMonkey thanks for your support. Rolling back to OS 11.2 also solved the issue for me: system monitor data is back and core is 2024.1.2, so OS 11.3 causes the problem.

remb0 commented 8 months ago

I have the same problem with CIFS/SMB from a Synology, so I changed to NFS without a problem. All my add-ons can use the share again and I see no differences. I don't actually know the pros and cons of each, so for now that workaround works perfectly.

lbschenkel commented 8 months ago

Hi, I found this by accident while Googling the same problem (but I had it with my desktop, I didn't even know there was a problem with HASS too).

After doing a lot of troubleshooting involving changing the configuration in the Samba server and the mount flags in the client, I found out that nothing worked and the problem is entirely at the client, with kernel version 6.1. I was using Arch and the LTS kernel (6.1.71) and any stat* system call on a CIFS share (reproducible with df, ls -l, etc.) would result in Resource temporarily unavailable. If I just boot with the most recent kernel (6.6.9 at the moment I write this) without changing anything, these errors go away.
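The check described above (a stat-type call failing on a CIFS share) can be reproduced with a short script. This is a rough sketch for Linux only: it assumes /proc/mounts is readable, the function names are made up for illustration, and mount points containing spaces (escaped as \040 in /proc/mounts) are not decoded.

```python
import os


def cifs_mountpoints(mounts_text: str) -> list[str]:
    """Extract mount points of CIFS file systems from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "cifs":
            points.append(fields[1])
    return points


def probe_statvfs(path: str) -> str:
    """Return 'ok' or the error text raised by os.statvfs for the path."""
    try:
        os.statvfs(path)
    except OSError as err:
        return f"failed: {err}"
    return "ok"


if __name__ == "__main__":
    with open("/proc/mounts", encoding="utf-8") as handle:
        for point in cifs_mountpoints(handle.read()):
            print(point, probe_statvfs(point))
```

os.statvfs is the same call that fails at the bottom of the SystemMonitor tracebacks in this thread.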

I did the same test in the root shell of HassOS and I'm getting the same errors when I mount the same CIFS share that had the same issue in my desktop, but no longer does after upgrading the kernel.

HassOS is also using 6.1, and the fact that the exact same error is happening with CIFS shares is probably not a coincidence. I have been on this LTS kernel for some time (months) and I'm sure this wasn't happening before, so there must be a regression introduced in a recent 6.1.x version.

In a nutshell:

lbschenkel commented 8 months ago

I did more experiments inside a VM. I "bisected" the LTS kernel versions to see which one started having this error with CIFS shares. It's exactly version 6.1.70 which is being used by HassOS. Newest LTS 6.1.71 is also affected.
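The "bisecting" idea can be sketched as a binary search over an ordered list of versions. This is only an illustration of the technique, not the procedure actually used here: first_bad is a hypothetical helper, and is_bad stands in for a manual test such as booting that kernel in a VM and running df on a CIFS share.

```python
from typing import Callable, Optional, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> Optional[str]:
    """Binary-search an ordered list for the first version where is_bad is True.

    Assumes every version before the regression is good and every version
    from the regression onward is bad.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid      # regression is at mid or earlier
        else:
            lo = mid + 1  # regression is after mid
    return versions[lo] if lo < len(versions) else None


if __name__ == "__main__":
    candidates = [f"6.1.{patch}" for patch in range(60, 72)]
    # Stand-in predicate for illustration; a real run tests each kernel by hand.
    print(first_bad(candidates, lambda v: int(v.rsplit(".", 1)[1]) >= 70))
```

Each step halves the remaining range, so a dozen candidate versions need at most four manual tests.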

This is the changelog: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.70

There were a few CIFS changes in there to fix other bugs, but I'm guessing that some of those broke the LTS kernel (perhaps some other change that should have been backported wasn't?).

If you need working CIFS right now, the best "solution" is to downgrade HassOS to use kernel 6.1.69 or older.

sairon commented 8 months ago

Thanks everyone (and thanks @lbschenkel especially for the extensive report and testing). I managed to reproduce the issue and started bisecting the linux-stable tree this morning, which pointed me to the specific commit that introduced the issue: https://github.com/gregkh/linux/commit/bef4315f19ba6f434054f58b958c0cf058c7a43f. Since it has been backported to 6.6.9 as well, which doesn't show the regression, there's likely something different in the 6.1.x tree. I'll look a bit deeper and report it to the appropriate mailing lists.

In the meantime, before the kernel issue is resolved, easiest solution is to downgrade to 11.3.rc1 (ha os update --version 11.3.rc1) which is missing only a few minor HAOS changes from the latest stable (and uses Linux 6.1.69).

lbschenkel commented 8 months ago

Thanks everyone (and thanks @lbschenkel especially for the extensive report and testing). I managed to reproduce the issue and started bisecting the linux-stable tree this morning, which pointed me to the specific commit that introduced the issue: gregkh/linux@bef4315. Since it has been backported to 6.6.9 as well, which doesn't show the regression, there's likely something different in the 6.1.x tree. I'll look a bit deeper and report it to the appropriate mailing lists.

FYI: I already reported it to the stable mailing list (https://lore.kernel.org/stable/8ad7c20e-0645-40f3-96e6-75257b4bd31a@schenkel.net/), I think it would be nice if you used the same thread.

henriklund commented 8 months ago

In the meantime, before the kernel issue is resolved, easiest solution is to downgrade to 11.3.rc1 (ha os update --version 11.3.rc1) which is missing only a few minor HAOS changes from the latest stable (and uses Linux 6.1.69).

Would it be wise to pull the v11.3 release?

lbschenkel commented 8 months ago

In the meantime, before the kernel issue is resolved, easiest solution is to downgrade to 11.3.rc1 (ha os update --version 11.3.rc1) which is missing only a few minor HAOS changes from the latest stable (and uses Linux 6.1.69).

Would it be wise to pull the v11.3 release?

I would suggest an 11.3.1 release that downgrades the kernel to 6.1.69. It's not ideal to downgrade the kernel and lose potential bug (and security) fixes, but on the other hand HassOS is always behind the latest stable kernel given their frequent releases.

sairon commented 8 months ago

Reverting the offending patch seems like a slightly better solution. 11.4.rc1 should be available this afternoon/evening (CET), 11.4 likely tomorrow.

sairon commented 8 months ago

HA OS 11.4 is out with the problem resolved. I'll keep this issue open until a proper fix lands in the next 6.1 kernel release.

HAREGIS commented 8 months ago

Frigate still not working with 11.4 { "request": { "type": "frigate/events/get", "instance_id": "frigate", "cameras": [ "Camera_jardin" ], "limit": 50, "id": 103 }, "response": { "code": "not_found", "message": "Unable to find Frigate instance with ID: frigate"

DunkyDaMonkey commented 8 months ago

Frigate still not working with 11.4 { "request": { "type": "frigate/events/get", "instance_id": "frigate", "cameras": [ "Camera_jardin" ], "limit": 50, "id": 103 }, "response": { "code": "not_found", "message": "Unable to find Frigate instance with ID: frigate"

I think you might have a different issue. I've rolled forward to 11.4 and Frigate, System Monitor and shares are all working for me.

sairon commented 8 months ago

@HAREGIS This seems unrelated to the original issue - even though the problem might have been a trigger for the situation, please report this in the Frigate issue tracker.

blackie333 commented 8 months ago

I'm sorry to report that the CIFS re-connection message still appears on my Odroid-C4 system after upgrading to HA OS 11.4 and rebooting. Manual repair of the connection always works, but only for some time. "dmesg | grep CIFS" returns nothing for me. P.S. I don't use Frigate, just a remote Backup folder on my NAS.

sairon commented 8 months ago

@blackie333 dmesg | grep CIFS should always return something, otherwise the mount wouldn't exist at all. What do you mean by "manual repair of the connection"? Can you provide at least some logs?

sairon commented 8 months ago

The issue is supposed to be fixed through #3064 for the upcoming releases, latest stable release 11.4 contains a workaround for the problem. If you encounter any problems, please create a new issue with all required logs.