home-assistant / operating-system

:beginner: Home Assistant Operating System
Apache License 2.0
4.79k stars 959 forks source link

Detecting HardwareAction.REMOVE/ADD hardware /dev/bus/usb/002/XXX spamming logs #1633

Closed mbo18 closed 2 years ago

mbo18 commented 2 years ago

Hardware Environment

Home Assistant OS release:

agners commented 2 years ago

It seems Linux has problems enumerating a particular device. error -71 typically means it has problems communicating on a hardware level. Can you unplug USB devices to isolate which one it is? Maybe replace USB cables/Hubs if necessary.

mbo18 commented 2 years ago

Can it be an internal device? All I currently have is a Conbee 2 (/dev/bus/usb/002/004), a RFLink (/dev/bus/usb/002/003) and a UPS (/dev/bus/usb/002/002). All are working fine.

agners commented 2 years ago

One of the problematic devices seems to be usb 2-7. At one point it enumerates: usb 2-7: New USB device found, idVendor=8087, idProduct=0a2a, bcdDevice= 0.01. The vendor/product ID indicates that this is a Intel Bluetooth device: https://devicehunt.com/view/type/usb/vendor/8087/device/0A2A

mbo18 commented 2 years ago

Ok so it is my NUC internal Bluetooth module as I did not add a USB Bluetooth module myself. What is strange is that this spamming seems to be here only since I updated to 6.6

agners commented 2 years ago

Can you try downgrading to see if its really OS version related? Use the Terminal Add-on and the following command:

ha os update --version 6.5
jle1n0 commented 2 years ago

I lost ConBee 2 stick with my NUC platform when upgrading to version 6.6. It just complained about the missing serial device. Downgrading back to 6.5 made the system work again. With dmesg I could see the stick being recognized at least to some degree...

(I'm running a supervised setup with Home Assistant OS.)

agners commented 2 years ago

@jle1n0 do you get continuously ADD and REMOVE hardware log entries? If not, this is unlikely related to this issue and your problem requires a separate new issue.

mbo18 commented 2 years ago

Can you try downgrading to see if its really OS version related? Use the Terminal Add-on and the following command:

ha os update --version 6.5

I have downgraded to 6.5 and logs are still there so not related to 6.6

agners commented 2 years ago

I'd suspect hardware here. If you don't use the module, maybe simply removing it (if its removable :sweat_smile: ). Maybe testing with another distribution might be worth a try.

gieljnssns commented 2 years ago

I'm also having problems with an Intel NUC (NUC8i7BeH) and a Conbee 2. From time to time when restarting HA (since I use the conbee)

Logger: zigpy.application
Source: /usr/local/lib/python3.9/site-packages/zigpy/application.py:73
First occurred: 08:48:43 (1 occurrences)
Last logged: 08:48:43

Couldn't start application
Logger: homeassistant.components.zha.core.gateway
Source: components/zha/core/gateway.py:152
Integration: Zigbee Home Automation (documentation, issues)
First occurred: 08:48:43 (1 occurrences)
Last logged: 08:48:43

Couldn't start deCONZ = dresden elektronik deCONZ protocol: ConBee I/II, RaspBee I/II coordinator
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 492, in wait_for
    fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 152, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.9/site-packages/zigpy/application.py", line 69, in new
    await app.startup(auto_form)
  File "/usr/local/lib/python3.9/site-packages/zigpy_deconz/zigbee/application.py", line 67, in startup
    self.version = await self._api.version()
  File "/usr/local/lib/python3.9/site-packages/zigpy_deconz/api.py", line 455, in version
    (self._proto_ver,) = await self[NetworkParameter.protocol_version]
  File "/usr/local/lib/python3.9/site-packages/zigpy_deconz/api.py", line 420, in read_parameter
    r = await self._command(Command.read_parameter, 1 + len(data), param, data)
  File "/usr/local/lib/python3.9/site-packages/zigpy_deconz/api.py", line 305, in _command
    return await asyncio.wait_for(fut, timeout=COMMAND_TIMEOUT)
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 494, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

Supervisor logs:

21-11-16 08:48:08 INFO (SyncWorker_9) [supervisor.docker.interface] Restarting ghcr.io/home-assistant/generic-x86-64-homeassistant
21-11-16 08:48:23 INFO (MainThread) [supervisor.homeassistant.core] Wait until Home Assistant is ready
21-11-16 08:48:27 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
21-11-16 08:48:27 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
21-11-16 08:48:40 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.REMOVE hardware /dev/ttyACM0 - /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2215018-if00
21-11-16 08:48:40 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.REMOVE hardware /dev/bus/usb/001/050 - None
21-11-16 08:48:43 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.ADD hardware /dev/bus/usb/001/051 - None
21-11-16 08:48:43 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.ADD hardware /dev/ttyACM0 - /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2215018-if00
21-11-16 08:48:44 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.REMOVE hardware /dev/ttyACM0 - /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2215018-if00
21-11-16 08:48:44 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.REMOVE hardware /dev/bus/usb/001/051 - None
21-11-16 08:48:52 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.ADD hardware /dev/bus/usb/001/053 - None
21-11-16 08:48:52 INFO (MainThread) [supervisor.hardware.monitor] Detecting HardwareAction.ADD hardware /dev/ttyACM0 - /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2215018-if00
21-11-16 08:48:59 INFO (MainThread) [supervisor.homeassistant.core] Detect a running Home Assistant instance

Host logs:

[492574.103114] iwlwifi 0000:00:14.3: No beacon heard and the time event is over already...
[492574.103192] wlp0s20f3: Connection to AP b8:e3:b1:81:ca:0d lost
[492575.908603] wlp0s20f3: authenticate with b8:e3:b1:81:ca:0d
[492575.914936] wlp0s20f3: send auth to b8:e3:b1:81:ca:0d (try 1/3)
[492575.968955] wlp0s20f3: authenticated
[492575.969238] wlp0s20f3: associate with b8:e3:b1:81:ca:0d (try 1/3)
[492575.999617] wlp0s20f3: RX AssocResp from b8:e3:b1:81:ca:0d (capab=0x31 status=0 aid=1)
[492576.010257] wlp0s20f3: associated
[492576.529148] iwlwifi 0000:00:14.3: No beacon heard and the time event is over already...
[492576.529224] wlp0s20f3: Connection to AP b8:e3:b1:81:ca:0d lost
[492578.090888] wlp0s20f3: authenticate with b8:e3:b1:81:ca:0d
[492578.092743] wlp0s20f3: send auth to b8:e3:b1:81:ca:0d (try 1/3)
[492578.145854] wlp0s20f3: authenticated
[492578.146181] wlp0s20f3: associate with b8:e3:b1:81:ca:0d (try 1/3)
[492578.177678] wlp0s20f3: RX AssocResp from b8:e3:b1:81:ca:0d (capab=0x31 status=0 aid=1)
[492578.180151] wlp0s20f3: associated
[494181.350300] audit: type=1334 audit(1637027171.759:839): prog-id=275 op=LOAD
[494181.350481] audit: type=1334 audit(1637027171.759:840): prog-id=276 op=LOAD
[494181.470542] audit: type=1334 audit(1637027171.879:841): prog-id=277 op=LOAD
[494181.470576] audit: type=1334 audit(1637027171.879:842): prog-id=278 op=LOAD
[494211.495675] audit: type=1334 audit(1637027201.906:843): prog-id=276 op=UNLOAD
[494211.495683] audit: type=1334 audit(1637027201.906:844): prog-id=275 op=UNLOAD
[494211.622587] audit: type=1334 audit(1637027202.033:845): prog-id=278 op=UNLOAD
[494211.622597] audit: type=1334 audit(1637027202.033:846): prog-id=277 op=UNLOAD
[497408.620337] wlp0s20f3: deauthenticating from b8:e3:b1:81:ca:0d by local choice (Reason: 3=DEAUTH_LEAVING)
[497410.519968] wlp0s20f3: authenticate with b8:e3:b1:81:ca:0d
[497410.525838] wlp0s20f3: send auth to b8:e3:b1:81:ca:0d (try 1/3)
[497410.578872] wlp0s20f3: authenticated
[497410.579872] wlp0s20f3: associate with b8:e3:b1:81:ca:0d (try 1/3)
[497410.607573] wlp0s20f3: RX AssocResp from b8:e3:b1:81:ca:0d (capab=0x31 status=0 aid=1)
[497410.611453] wlp0s20f3: associated
[501781.673742] audit: type=1334 audit(1637034772.211:847): prog-id=279 op=LOAD
[501781.673802] audit: type=1334 audit(1637034772.211:848): prog-id=280 op=LOAD
[501781.798573] audit: type=1334 audit(1637034772.336:849): prog-id=281 op=LOAD
[501781.798660] audit: type=1334 audit(1637034772.336:850): prog-id=282 op=LOAD
[501811.836903] audit: type=1334 audit(1637034802.376:851): prog-id=280 op=UNLOAD
[501811.836912] audit: type=1334 audit(1637034802.376:852): prog-id=279 op=UNLOAD
[501811.932911] audit: type=1334 audit(1637034802.472:853): prog-id=282 op=UNLOAD
[501811.932919] audit: type=1334 audit(1637034802.472:854): prog-id=281 op=UNLOAD
[504702.513831] wlp0s20f3: deauthenticating from b8:e3:b1:81:ca:0d by local choice (Reason: 3=DEAUTH_LEAVING)
[504704.433537] wlp0s20f3: authenticate with b8:e3:b1:81:ca:0d
[504704.439884] wlp0s20f3: send auth to b8:e3:b1:81:ca:0d (try 1/3)
[504704.496084] wlp0s20f3: authenticated
[504704.496657] wlp0s20f3: associate with b8:e3:b1:81:ca:0d (try 1/3)
[504704.525090] wlp0s20f3: RX AssocResp from b8:e3:b1:81:ca:0d (capab=0x31 status=0 aid=1)
[504704.532752] wlp0s20f3: associated
[509148.490999] usb 2-2: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
[509148.502774] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[509288.745139] usb 2-2: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
[509288.757202] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[509382.003996] audit: type=1334 audit(1637042372.668:855): prog-id=283 op=LOAD
[509382.004104] audit: type=1334 audit(1637042372.668:856): prog-id=284 op=LOAD
[509382.139385] audit: type=1334 audit(1637042372.804:857): prog-id=285 op=LOAD
[509382.139563] audit: type=1334 audit(1637042372.804:858): prog-id=286 op=LOAD
[509412.182919] audit: type=1334 audit(1637042402.849:859): prog-id=284 op=UNLOAD
[509412.182924] audit: type=1334 audit(1637042402.849:860): prog-id=283 op=UNLOAD
[509412.292045] audit: type=1334 audit(1637042402.958:861): prog-id=286 op=UNLOAD
[509412.292070] audit: type=1334 audit(1637042402.958:862): prog-id=285 op=UNLOAD
[512003.376971] wlp0s20f3: deauthenticating from b8:e3:b1:81:ca:0d by local choice (Reason: 3=DEAUTH_LEAVING)
[512005.032646] wlp0s20f3: authenticate with b8:e3:b1:81:ca:0d
[512005.038471] wlp0s20f3: send auth to b8:e3:b1:81:ca:0d (try 1/3)
[512005.089425] wlp0s20f3: authenticated
[512005.090497] wlp0s20f3: associate with b8:e3:b1:81:ca:0d (try 1/3)
[512005.120090] wlp0s20f3: RX AssocResp from b8:e3:b1:81:ca:0d (capab=0x31 status=0 aid=1)
[512005.123400] wlp0s20f3: associated
[514087.747560] usb 1-4: USB disconnect, device number 45
[514087.996778] usb 1-4: new full-speed USB device number 46 using xhci_hcd
[514088.124493] usb 1-4: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
[514088.124496] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[514088.124498] usb 1-4: Product: ConBee II
[514088.124500] usb 1-4: Manufacturer: dresden elektronik ingenieurtechnik GmbH
[514088.124501] usb 1-4: SerialNumber: DE2215018
[514088.125740] cdc_acm 1-4:1.0: ttyACM0: USB ACM device
[514091.577110] usb 1-4: USB disconnect, device number 46
[514091.880685] usb 1-4: new full-speed USB device number 47 using xhci_hcd
[514092.008196] usb 1-4: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
[514092.008200] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[514092.008202] usb 1-4: Product: ConBee II
[514092.008204] usb 1-4: Manufacturer: dresden elektronik ingenieurtechnik GmbH
[514092.008205] usb 1-4: SerialNumber: DE2215018
[514092.009373] cdc_acm 1-4:1.0: ttyACM0: USB ACM device
[514122.988108] usb 1-4: USB disconnect, device number 47
[514123.237194] usb 1-4: new full-speed USB device number 48 using xhci_hcd
[514123.364727] usb 1-4: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
[514123.364729] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[514123.364730] usb 1-4: Product: ConBee II
[514123.364731] usb 1-4: Manufacturer: dresden elektronik ingenieurtechnik GmbH
[514123.364732] usb 1-4: SerialNumber: DE2215018
[514123.365910] cdc_acm 1-4:1.0: ttyACM0: USB ACM device
[514126.819506] usb 1-4: USB disconnect, device number 48
[514127.123138] usb 1-4: new full-speed USB device number 49 using xhci_hcd
[514132.553070] usb 1-4: unable to read config index 0 descriptor/all
[514132.553077] usb 1-4: can't read configurations, error -110
[514132.667045] usb 1-4: new full-speed USB device number 50 using xhci_hcd
[514132.795683] usb 1-4: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
[514132.795685] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[514132.795687] usb 1-4: Product: ConBee II
[514132.795688] usb 1-4: Manufacturer: dresden elektronik ingenieurtechnik GmbH
[514132.795689] usb 1-4: SerialNumber: DE2215018
[514132.796865] cdc_acm 1-4:1.0: ttyACM0: USB ACM device
[515930.058389] usb 1-4: USB disconnect, device number 50
[515930.307537] usb 1-4: new full-speed USB device number 51 using xhci_hcd
[515930.434925] usb 1-4: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
[515930.434928] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[515930.434929] usb 1-4: Product: ConBee II
[515930.434931] usb 1-4: Manufacturer: dresden elektronik ingenieurtechnik GmbH
[515930.434932] usb 1-4: SerialNumber: DE2215018
[515930.436078] cdc_acm 1-4:1.0: ttyACM0: USB ACM device
[515934.169890] usb 1-4: USB disconnect, device number 51
[515934.472502] usb 1-4: new full-speed USB device number 52 using xhci_hcd
[515939.885746] usb 1-4: unable to read config index 0 descriptor/all
[515939.885762] usb 1-4: can't read configurations, error -110
[515939.999647] usb 1-4: new full-speed USB device number 53 using xhci_hcd
[515940.127922] usb 1-4: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
[515940.127927] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[515940.127929] usb 1-4: Product: ConBee II
[515940.127932] usb 1-4: Manufacturer: dresden elektronik ingenieurtechnik GmbH
[515940.127933] usb 1-4: SerialNumber: DE2215018
[515940.129344] cdc_acm 1-4:1.0: ttyACM0: USB ACM device
[516982.309310] audit: type=1334 audit(1637049973.101:863): prog-id=287 op=LOAD
[516982.309415] audit: type=1334 audit(1637049973.102:864): prog-id=288 op=LOAD
[516982.444764] audit: type=1334 audit(1637049973.237:865): prog-id=289 op=LOAD
[516982.444823] audit: type=1334 audit(1637049973.237:866): prog-id=290 op=LOAD
[517012.487965] audit: type=1334 audit(1637050003.282:867): prog-id=288 op=UNLOAD
[517012.487973] audit: type=1334 audit(1637050003.282:868): prog-id=287 op=UNLOAD
[517012.598050] audit: type=1334 audit(1637050003.392:869): prog-id=290 op=UNLOAD
[517012.598062] audit: type=1334 audit(1637050003.392:870): prog-id=289 op=UNLOAD

Sometimes I have to restart HA 2 or 3 times to get ZHA running

System Health

version core-2021.11.4
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.9.7
os_name Linux
os_version 5.10.75
arch x86_64
timezone Europe/Brussels
Home Assistant Community Store GitHub API | ok -- | -- Github API Calls Remaining | 4568 Installed Version | 1.17.2 Stage | running Available Repositories | 989 Installed Repositories | 34
Home Assistant Supervisor host_os | Home Assistant OS 6.6 -- | -- update_channel | beta supervisor_version | supervisor-2021.10.8 docker_version | 20.10.8 disk_total | 457.7 GB disk_used | 106.9 GB healthy | true supported | failed to load: Unsupported board | generic-x86-64 supervisor_api | ok version_api | ok installed_addons | Samba share (9.5.1), AdGuard Home (4.2.0), AppDaemon 4 (0.7.1), Caddy (1.6), Check Home Assistant configuration (3.9.0), Duck DNS (1.14.0), ESPHome (2021.10.3), Mosquitto broker (6.0.1), Studio Code Server (3.7.0), WireGuard (0.6.0), MariaDB (2.4.0), Samba Backup (5.0.0), InfluxDB (4.3.0), Grafana (7.3.1), Frigate NVR (2.3), SSH & Web Terminal (9.1.0), Metabase (0.3.0), Portainer (2.0.0), OwnTone (dev), Simple Scheduler (0.31), Spotify Connect (0.10.0), Glances (0.14.0), phpMyAdmin (0.6.0)
Lovelace dashboards | 3 -- | -- resources | 18 views | 18 mode | storage
Spotify api_endpoint_reachable | ok -- | --
agners commented 2 years ago

@gieljnssns hm, sudden disconnects, sometimes in a row. I'd guess something hardware related here. Maybe using a different port or/and a USB cable helps? There have been similar USB related issues reported lately with Intel NUCs, they turned out to be some type of hardware problem (see https://community.home-assistant.io/t/home-assistant-os-on-intel-nuc-no-usb-devices/355841/16).

gieljnssns commented 2 years ago

I already tried to change USB port and extension cable, didn't help.

I can upgrade to OS 7.0.rc1, can this be helpful?

agners commented 2 years ago

I can upgrade to OS 7.0.rc1, can this be helpful?

Unlikely, but who knows :)

gieljnssns commented 2 years ago

Are there any known issues with rc1?

agners commented 2 years ago

Not that I am aware of so far.

jle1n0 commented 2 years ago

@jle1n0 do you get continuously ADD and REMOVE hardware log entries? If not, this is unlikely related to this issue and your problem requires a separate new issue.

Ok. I just upgraded to 7.0 and faced the same issue. The system lost the whole Zigbee network behind Conbee II. I rebooted the system three times with the same result. Now I downgraded back to 6.5 and all is good again.

Shall I open a new issue or something? I can debug/collect logs tomorrow if needed.

agners commented 2 years ago

Now I downgraded back to 6.5 and all is good again.

So you only downgraded the OS and that made it work again?

Shall I open a new issue or something? I can debug/collect logs tomorrow if needed.

Yes please.

jle1n0 commented 2 years ago

Now I downgraded back to 6.5 and all is good again.

So you only downgraded the OS and that made it work again?

Shall I open a new issue or something? I can debug/collect logs tomorrow if needed.

Yes please.

Done: https://github.com/home-assistant/operating-system/issues/1680 I hope everything relevant is included there.

mbo18 commented 2 years ago

Regarding my issue, I have disabled bluetooth in BIOS and logs are gone 👍 Fortunately I don't use bluetooth

agners commented 2 years ago

Thanks for the update. I'll close it then, if someone else encounters a similar problem, please open a new issue.