Closed: ovizii closed this issue 8 months ago.
Please provide correct logs - i.e. Host and Supervisor ones. They should hopefully contain something useful. Unfortunately, I am not able to reproduce the issue here with CIFS shares so there has to be something specific in your environment.
I'll give it a try, here are logs from
HOST:
Jan 05 16:50:23 ha-ct systemd[1]: run-docker-runtime\x2drunc-moby-f0c19d2f5ab43f466ccfb03620fe4d78dfdf274853b9a4af9caffb783027717c-runc.eAgyBS.mount: Deactivated successfully.
Jan 05 16:50:26 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:26 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:26 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:27 ha-ct kernel: CIFS: VFS: reconnect tcon failed rc = -11
Jan 05 16:50:30 ha-ct kernel: CIFS: VFS: \\192.168.98.2 Send error in SessSetup = -11
Jan 05 16:50:53 ha-ct systemd[1]: run-docker-runtime\x2drunc-moby-f0c19d2f5ab43f466ccfb03620fe4d78dfdf274853b9a4af9caffb783027717c-runc.jhgq7y.mount: Deactivated successfully.
Jan 05 16:50:58 ha-ct kernel: CIFS: VFS: \\192.168.98.2 Send error in SessSetup = -11
Jan 05 16:51:23 ha-ct systemd[1]: run-docker-runtime\x2drunc-moby-
SUPERVISOR does not seem to contain anything related.
The return code is `EAGAIN` (i.e. "try again"), but I can't find where it's coming from. There might be some more lines with hints in earlier logs, as the first line when a share is mounted is `CIFS: Attempting to mount \\...`. Could you (or anybody else) provide the whole `dmesg | grep CIFS` output?
Also, what is the SMB host? Can you get any logs from that when this happens?
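As a side note, the errno mapping above can be confirmed from Python (these are the standard Linux errno values; just a sanity check, not HAOS code):

```python
import errno
import os

# The kernel's "rc = -11" in the CIFS messages corresponds to EAGAIN on Linux.
print(errno.EAGAIN)               # 11
print(os.strerror(errno.EAGAIN))  # "Resource temporarily unavailable" on glibc
```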
Here's my try. The output of `dmesg | grep CIFS | more`:
The rest of the logs are all the same, no other entries. Sadly, I have minimized the Samba logs; I would need to raise the log level and restart. Or was it lower the log level? Hm. Let's see if anyone else can contribute first.
My Samba is this docker container running on another host: https://github.com/uPagge/samba
My attempt, and what looks like pertinent info from the HOST log:
Jan 05 18:46:24 homeassistant dropbear[11374]: [11374] Jan 05 18:46:24 Exit (root) from <192.168.128.60:34304>: Disconnect received
Jan 05 18:46:48 homeassistant systemd[1]: Unmounting Supervisor bind mount: bind_Music...
Jan 05 18:46:48 homeassistant systemd[1]: mnt-data-supervisor-media-Music.mount: Deactivated successfully.
Jan 05 18:46:48 homeassistant systemd[1]: Unmounted Supervisor bind mount: bind_Music.
Jan 05 18:46:49 homeassistant systemd[1]: Unmounting Supervisor cifs mount: Music...
Jan 05 18:46:49 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Deactivated successfully.
Jan 05 18:46:49 homeassistant systemd[1]: Unmounted Supervisor cifs mount: Music.
Jan 05 18:46:50 homeassistant systemd[1]: Mounting Supervisor cifs mount: Music...
Jan 05 18:46:50 homeassistant kernel: CIFS: Attempting to mount \\192.168.128.6\BACKUPS
Jan 05 18:46:50 homeassistant mount[14847]: mount error(13): Permission denied
Jan 05 18:46:50 homeassistant mount[14847]: Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)
Jan 05 18:46:50 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -13
Jan 05 18:46:50 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Mount process exited, code=exited, status=32/n/a
Jan 05 18:46:50 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Failed with result 'exit-code'.
Jan 05 18:46:50 homeassistant systemd[1]: Failed to mount Supervisor cifs mount: Music.
Jan 05 18:47:18 homeassistant systemd[1]: Mounting Supervisor cifs mount: Music...
Jan 05 18:47:18 homeassistant kernel: CIFS: Attempting to mount \\192.168.128.6\BACKUPS
Jan 05 18:47:18 homeassistant mount[15105]: mount error(95): Operation not supported
Jan 05 18:47:18 homeassistant mount[15105]: Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)
Jan 05 18:47:18 homeassistant kernel: CIFS: VFS: \\192.168.128.6 Dialect not supported by server. Consider specifying vers=1.0 or vers=2.0 on mount for accessing older servers
Jan 05 18:47:18 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -95
Jan 05 18:47:18 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Mount process exited, code=exited, status=32/n/a
Jan 05 18:47:18 homeassistant systemd[1]: mnt-data-supervisor-mounts-Music.mount: Failed with result 'exit-code'.
Jan 05 18:47:18 homeassistant systemd[1]: Failed to mount Supervisor cifs mount: Music.
Jan 05 18:47:32 homeassistant systemd[1]: Mounting Supervisor cifs mount: Music...
Jan 05 18:47:32 homeassistant kernel: CIFS: Attempting to mount \\192.168.128.6\BACKUPS
Jan 05 18:47:32 homeassistant systemd[1]: Mounted Supervisor cifs mount: Music.
Jan 05 18:47:42 homeassistant systemd[1]: Mounting Supervisor bind mount: bind_Music...
Jan 05 18:47:42 homeassistant systemd[1]: Mounted Supervisor bind mount: bind_Music.
Jan 05 18:48:00 homeassistant systemd[1]: Starting Cleanup of Temporary Directories...
Jan 05 18:48:00 homeassistant systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Jan 05 18:48:00 homeassistant systemd[1]: Finished Cleanup of Temporary Directories.
Jan 05 18:48:00 homeassistant systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
Jan 05 18:48:35 homeassistant dropbear[15812]: [15812] Jan 05 18:48:35 Child connection from 192.168.128.60:49770
When I do a `df` in the terminal window I get a familiar error message, and I get a very similar message when I ssh into the underlying OS.
Also, note the version of smbd: the docs say it needs to be SMB protocol 2 or higher, preferably SMB3.
On my Pi 4, which is running 11.3 / 2024.1.1, the `df` output looks normal and the System Monitor integration works just fine.
Running core-2024.1.0.
Upgrading the OS to 11.3 triggered this behaviour (exemplified by SystemMonitor failing in my case).
Shares also show up as unavailable when running `df`; however, they continue to work (tested by copying an existing file to a new name on the HA side and performing a `diff` on the NAS side).
Hi @henriklund, I also have the issue that SystemMonitor has no data after upgrading. @gjohansson-ST says that's something different and it should get its own issue; maybe you can open one.
+1, I am facing the same error.
It started after updating to HAOS 11.3 and Core 2024.1.1.
+1 I had the same issue with the 11.1 -> 11.3 HAOS update, along with a few other quirks:
I didn't have a lot of time to debug so I just rolled back to 11.1 (last running operating system as I was away during the 11.2 release).
+1. Getting samba storage issues after upgrade as well
Same here. As mentioned above, a rollback to 11.2 resolved the issue.
Please, when adding comments like "+1" to this issue, also attach logs from HAOS and details about your setup (what type of installation are you running, where is Samba share hosted). So far we haven't been able to reproduce the issue and gathering more info is crucial for identifying the cause.
Hi @sairon, just to double-check: is the SMB issue linked to the problem that the System Monitor is not working and has no data? If yes, here is my log data:

Logger: homeassistant.components.sensor
Source: helpers/entity_platform.py:360
Integration: Sensor (documentation, issues)
First occurred: January 5, 2024 at 19:24:49 (2 occurrences)
Last logged: 11:07:55

Error while setting up systemmonitor platform for sensor
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 360, in _async_setup_platform
    await asyncio.shield(task)
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/sensor.py", line 393, in async_setup_entry
    disk_arguments = await hass.async_add_executor_job(get_all_disk_mounts)
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/systemmonitor/util.py", line 22, in get_all_disk_mounts
    usage = psutil.disk_usage(part.mountpoint)
  File "/usr/local/lib/python3.11/site-packages/psutil/__init__.py", line 2012, in disk_usage
    return _psplatform.disk_usage(path)
  File "/usr/local/lib/python3.11/site-packages/psutil/_psposix.py", line 177, in disk_usage
    st = os.statvfs(path)
BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/DSEmma_Media'
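The traceback shows the failure bottoms out in a plain `os.statvfs()` call, so a single affected CIFS mount aborts the whole sensor setup. As an illustration only (a hedged sketch with a hypothetical helper name, not the actual integration code), a variant that tolerates a mount stuck in the EAGAIN state might look like:

```python
import os

def disk_usage_tolerant(path):
    """Return (total, used, free) in bytes, or None if statvfs() fails,
    e.g. with BlockingIOError/EAGAIN on an affected CIFS mount."""
    try:
        st = os.statvfs(path)
    except OSError:
        # BlockingIOError is a subclass of OSError, so EAGAIN lands here too.
        return None
    total = st.f_blocks * st.f_frsize
    free = st.f_bavail * st.f_frsize
    return total, total - free, free

# A failing mount point yields None instead of aborting the whole setup.
print(disk_usage_tolerant("/"))
```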
@chma73 Yes, it's the same root cause.
@ovizii Tried the same setup - HAOS 11.3 generic-x86-64, upagge/samba container running on another host (latest image with smbd 4.16.8). The shares mount just fine :shrug:
Maybe there are more details that make the difference. Can you please check Samba logs on the remote host, confirm it's the same smbd version, ideally also share config you are using on the server?
@jweston2112 In your case the problem seems to be a bit different. Was it indeed triggered by the update and does downgrade help? In your logs I see these:
Jan 05 18:46:50 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -13
Jan 05 18:47:18 homeassistant kernel: CIFS: VFS: cifs_mount failed w/return code = -95
The first one is the return code for "permission denied", which the CIFS client returns when incorrect credentials are supplied. The second one is "operation not supported", which is instead returned when you select an incorrect protocol version. Both of these rather suggest some misconfiguration; are you sure you haven't made any changes to your shares' configuration in the meantime?
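For the record, those kernel return codes map to standard Linux errno values, which can be double-checked from Python:

```python
import errno
import os

# cifs_mount's -13 is EACCES, and -95 is EOPNOTSUPP (Linux errno values).
print(errno.EACCES, os.strerror(errno.EACCES))
print(errno.EOPNOTSUPP, os.strerror(errno.EOPNOTSUPP))
```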
@sairon Oh yes, I meant to mention that. I have removed and re-added the shares in the storage app multiple times. I found that when you select CIFS it doesn't give you the choice of protocol until it fails once (unless I am doing something else wrong), so it failed because I didn't initially have the option to choose "Auto (2.1+)". As I said above, the Docker container I am using specifically says the protocol needs to be higher than 2 and that SMB3 is preferred.
As you can see below the errors, the share mounted just fine in the end. I think it's the OS reporting the share as temporarily unavailable, instead of the information about the share, that is throwing SystemMonitor for a loop.
@ovizii Tried the same setup - HAOS 11.3 generic-x86-64, upagge/samba container running on another host (latest image with smbd 4.16.8). The shares mount just fine 🤷
My shares also mount just fine, please check my problem again:
When Frigate runs shutil.disk_usage(path) it results in this error:
2024-01-05 11:49:24.300409714 BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'
This looks like it's just the resource-usage check that fails, hence Frigate reports the share as unavailable.
@sairon - can you look on this? https://github.com/blakeblackshear/frigate/issues/2943#issuecomment-1878738114
2024-01-05 11:49:24.300409714 BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'
This looks like it's just the resource-usage check that fails, hence Frigate reports the share as unavailable.
If @ovizii is right, that will be helpful and is probably a lead toward the solution. I can't check this myself because I have a production system.
Hi @henriklund, I also have the issue that SystemMonitor has no data after upgrading. @gjohansson-ST says that's something different and it should get its own issue; maybe you can open one.
I do not see the SystemMonitor fault as anything but a symptom. `df -k | grep unavailable` returns `df: /share/NAS_share: Resource temporarily unavailable`. However, all operations on that share (copy / create / delete etc.) work as expected. On the NAS side the CIFS (SMB3) connection is logged as completed without error and all changes are registered.
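That split (plain I/O works while the metadata query fails) can be probed directly. A small hypothetical helper along those lines (run it against the affected mount point, e.g. the `/share/NAS_share` path above):

```python
import os

def probe_mount(path):
    """Try a plain directory read and a statvfs() on the same path and
    report which of the two fails."""
    report = {}
    for name, call in (("listdir", os.listdir), ("statvfs", os.statvfs)):
        try:
            call(path)
            report[name] = "ok"
        except OSError as exc:
            report[name] = "failed (errno %s)" % exc.errno
    return report

# On a healthy filesystem both succeed; on a mount hit by this bug only
# statvfs is expected to fail with errno 11 (EAGAIN).
print(probe_mount("/tmp"))
```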
Please, when adding comments like "+1" to this issue, also attach logs from HAOS and details about your setup (what type of installation are you running, where is Samba share hosted). So far we haven't been able to reproduce the issue and gathering more info is crucial for identifying the cause.
Sorry, I should have known better.
I've rolled forward again and got the same issue; here are the details to help with recreation.
Version: Note I rolled forward HAOS 11.1 -> 11.3 (this is the only change I made that resulted in the issue).
Frigate version 0.12.1 Frigate integration version 4.0.1 Frigate uses an externally mounted NAS for storage (CIFS) (as does backups, recorder DB)
Install:
Problem:
This error originated from a custom integration.
Logger: custom_components.frigate.api
Source: custom_components/frigate/api.py:227
Integration: Frigate (documentation, issues)
First occurred: 10:21:38 AM (6 occurrences)
Last logged: 10:25:06 AM
Error fetching information from http://ccab4aaf-frigate:5000/api/stats: Cannot connect to host ccab4aaf-frigate:5000 ssl:default [Name has no usable address]
Error fetching information from http://ccab4aaf-frigate:5000/api/stats: 500, message='INTERNAL SERVER ERROR', url=URL('http://ccab4aaf-frigate:5000/api/stats')
Frigate is unable to set up.
Logger: homeassistant.components.sensor
Source: helpers/entity_platform.py:360
Integration: Sensor (documentation, issues)
First occurred: 10:21:32 AM (1 occurrence)
Last logged: 10:21:32 AM
Error while setting up systemmonitor platform for sensor
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 360, in _async_setup_platform
await asyncio.shield(task)
File "/usr/src/homeassistant/homeassistant/components/systemmonitor/sensor.py", line 393, in async_setup_entry
disk_arguments = await hass.async_add_executor_job(get_all_disk_mounts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/homeassistant/homeassistant/components/systemmonitor/util.py", line 22, in get_all_disk_mounts
usage = psutil.disk_usage(part.mountpoint)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/psutil/__init__.py", line 2012, in disk_usage
return _psplatform.disk_usage(path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/psutil/_psposix.py", line 177, in disk_usage
st = os.statvfs(path)
^^^^^^^^^^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/Surveillance'
System monitor shows it as loaded but all values are unavailable
Rolling back to 11.2 this time
Hi @DunkyDaMonkey, I'm meanwhile also sure that I got the issue with the OS update from 11.2 to 11.3. How did you do the rollback to 11.2? (Sorry for asking; it's the first time I'm doing an OS rollback.)
No issues with rolling back to 11.2 instead of 11.1 (`ha os update --version 11.1` is the command if you need it).
I've also updated to Core 2024.1.2 with no issues.
@DunkyDaMonkey thanks for your support. Rolling back to OS 11.2 also solved the issue for me; System Monitor data is back (Core is 2024.1.2). OS 11.3 caused the problem.
I have the same with CIFS/SMB from a Synology, so I changed to NFS without a problem. All my add-ons can use the share again and I see no differences. I actually don't know the pros and cons between them, but for now that workaround works perfectly.
Hi, I found this by accident while Googling the same problem (but I had it with my desktop, I didn't even know there was a problem with HASS too).
After doing a lot of troubleshooting involving changing the configuration in the Samba server and the mount flags in the client, I found out that nothing worked and the problem is entirely at the client, with kernel version 6.1. I was using Arch and the LTS kernel (6.1.71), and any `stat*` system call on a CIFS share (reproducible with `df`, `ls -l`, etc.) would result in `Resource temporarily unavailable`. If I just boot with the most recent kernel (6.6.9 at the moment I write this) without changing anything, these errors go away.
I did the same test in the root shell of HassOS and I'm getting the same errors when I mount the same CIFS share that had the same issue in my desktop, but no longer does after upgrading the kernel.
HassOS is also using 6.1, and the fact that the exact same error is happening with CIFS shares is probably not a coincidence. I have been on this LTS for some time (months) and I'm sure this wasn't happening before, so there must be some regression introduced in recent versions of 6.1.x.
In a nutshell:
I did more experiments inside a VM. I "bisected" the LTS kernel versions to see which one started having this error with CIFS shares. It's exactly version 6.1.70 which is being used by HassOS. Newest LTS 6.1.71 is also affected.
This is the changelog: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.70
There were a few CIFS changes in there to fix other bugs, but I'm guessing that some of those broke the LTS kernel (perhaps some other change that should have been backported wasn't?).
If you need working CIFS right now, the best "solution" is to downgrade HassOS to use kernel 6.1.69 or older.
Thanks everyone (and thanks @lbschenkel especially for the extensive report and testing), I managed to reproduce the issue and started bisecting the linux-stable tree this morning, which pointed me to the specific commit which introduced the issue: https://github.com/gregkh/linux/commit/bef4315f19ba6f434054f58b958c0cf058c7a43f. Since it's been backported to 6.6.9 as well which doesn't show the regression, there's likely something different in 6.1.x tree. I'll look a bit more deeper and report it in appropriate mailing lists.
In the meantime, before the kernel issue is resolved, the easiest solution is to downgrade to 11.3.rc1 (`ha os update --version 11.3.rc1`), which is missing only a few minor HAOS changes from the latest stable (and uses Linux 6.1.69).
Thanks everyone (and thanks @lbschenkel especially for the extensive report and testing), I managed to reproduce the issue and started bisecting the linux-stable tree this morning, which pointed me to the specific commit which introduced the issue: gregkh/linux@bef4315. Since it's been backported to 6.6.9 as well which doesn't show the regression, there's likely something different in 6.1.x tree. I'll look a bit more deeper and report it in appropriate mailing lists.
FYI: I already reported it to the stable mailing list (https://lore.kernel.org/stable/8ad7c20e-0645-40f3-96e6-75257b4bd31a@schenkel.net/), I think it would be nice if you used the same thread.
In the meantime, before the kernel issue is resolved, easiest solution is to downgrade to 11.3.rc1 (`ha os update --version 11.3.rc1`) which is missing only a few minor HAOS changes from the latest stable (and uses Linux 6.1.69).
Would it be wise to pull the v11.3 release?
In the meantime, before the kernel issue is resolved, easiest solution is to downgrade to 11.3.rc1 (`ha os update --version 11.3.rc1`) which is missing only a few minor HAOS changes from the latest stable (and uses Linux 6.1.69).
Would it be wise to pull the v11.3 release?
I would suggest an 11.3.1 downgrading to kernel 6.1.69. Not ideal to be downgrading the kernel and losing potential bug (and security) fixes, but on the other hand HassOS is always behind the latest stable kernel anyway, given how frequent kernel releases are.
Reverting the offending patch seems like a slightly better solution. 11.4.rc1 should be available this afternoon/evening (CET), 11.4 likely tomorrow.
HA OS 11.4 is out with the problem resolved. I'll keep this issue open until a proper fix lands in the next 6.1 kernel release.
Frigate still not working with 11.4:

{
  "request": {
    "type": "frigate/events/get",
    "instance_id": "frigate",
    "cameras": [ "Camera_jardin" ],
    "limit": 50,
    "id": 103
  },
  "response": {
    "code": "not_found",
    "message": "Unable to find Frigate instance with ID: frigate"
  }
}
I think you might have a different issue. I've rolled forward to 11.4 and Frigate, System Monitor, and shares are all working for me.
@HAREGIS This seems unrelated to the original issue - even though the problem might have been a trigger for the situation, please report this in the Frigate issue tracker.
I'm sorry to report that the CIFS re-connection message still appears on my Odroid-C4 system after upgrading to HA OS 11.4 and rebooting. Manually repairing the connection always works, but only for some time. `dmesg | grep CIFS` returns nothing for me. P.S. I don't use Frigate, just a remote backup folder on my NAS.
@blackie333 `dmesg | grep CIFS` should always return something, otherwise the mount wouldn't exist at all. What do you mean by "manual repair of the connection"? Can you provide at least some logs?
The issue is supposed to be fixed through #3064 for the upcoming releases, latest stable release 11.4 contains a workaround for the problem. If you encounter any problems, please create a new issue with all required logs.
Describe the issue you are experiencing
I am using:
HA OS 11.3 HA Core 2024.1.0 HA Supervisor 2023.12.0
When Frigate runs `shutil.disk_usage(path)`, it results in this error:
2024-01-05 11:49:24.300409714 BlockingIOError: [Errno 11] Resource temporarily unavailable: '/media/frigate/recordings'
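`shutil.disk_usage()` is a thin wrapper over `os.statvfs()`, which is why the kernel-side EAGAIN surfaces here as `BlockingIOError` (errno 11). A hedged sketch (illustrative only, not Frigate's actual code) of how a caller could degrade gracefully instead of crashing:

```python
import errno
import shutil

def usage_or_none(path):
    """Return shutil.disk_usage(path), or None when the mount reports
    'Resource temporarily unavailable' instead of propagating the error."""
    try:
        return shutil.disk_usage(path)
    except OSError as exc:
        if exc.errno == errno.EAGAIN:
            return None
        raise

# On a healthy path this prints the usual (total, used, free) named tuple.
print(usage_or_none("/"))
```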
What operating system image do you use?
generic-x86-64 (Generic UEFI capable x86-64 systems)
What version of Home Assistant Operating System is installed?
11.3
Did you upgrade the Operating System?
Yes
Steps to reproduce the issue
Anything in the Supervisor logs that might be useful for us?
Anything in the Host logs that might be useful for us?
System information
System Information
Home Assistant Community Store
GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.33.0
Stage | running
Available Repositories | 1366
Downloaded Repositories | 10

Home Assistant Cloud

logged_in | false
-- | --
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor

host_os | Home Assistant OS 11.3
-- | --
update_channel | stable
supervisor_version | supervisor-2023.12.0
agent_version | 1.6.0
docker_version | 24.0.7
disk_total | 48.5 GB
disk_used | 12.8 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Check Home Assistant configuration (3.11.0), Log Viewer (0.16.0), Tailscale (1.56.1.1), Samba Backup (5.2.0), Advanced SSH & Web Terminal (17.0.1), Squeezelite (0.0.17), Mosquitto broker (6.4.0), Zigbee2MQTT (1.35.0-1), Studio Code Server (5.14.2), Frigate (Full Access) (0.12.1), Music Assistant BETA (2.0.0b80)

Dashboards

dashboards | 1
-- | --
resources | 3
views | 1
mode | storage

Recorder

oldest_recorder_run | 31 December 2023 at 15:10
-- | --
current_recorder_run | 5 January 2024 at 12:48
estimated_db_size | 57.61 MiB
database_engine | sqlite
database_version | 3.41.2

Additional information
No response