home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
73.7k stars 30.82k forks source link

Ikea integration needs to be reloaded every morning #56141

Closed carloss66 closed 2 years ago

carloss66 commented 3 years ago

The problem

I have scripts to turn on or off Ikea light bulbs. One example is the Good Morning script. This script now fails to turn on the Ikea bulbs every morning. Nothing else Ikea related works in the morning, and the bulbs are greyed out in the HA user interface. For things to go back to normal, I have manually to reload the Ikea integration. That allows the integration to work until the end of the day, but the problem is back in the morning. The problem started with core-2021.9.1.

What is version of Home Assistant Core has the issue?

core-2021.9.6

What was the last working version of Home Assistant Core?

core-2021.8.8

What type of installation are you running?

Home Assistant OS

Integration causing the issue

IKEA TRÅDFRI

Link to integration documentation on our website

https://www.home-assistant.io/integrations/tradfri/

Example YAML snippet

alias: Good Morning
sequence:
  - data:
      entity_id: input_boolean.night_mode
    service: input_boolean.turn_off
  - data:
      entity_id: input_boolean.home_mode
    service: input_boolean.turn_on
  - data:
      brightness_pct: 15
      kelvin: 2700
    service: light.turn_on
    target:
      entity_id:
        - light.master_bedroom_left_lamp
        - light.master_bedroom_right_lamp
  - data:
      transition: 300
      brightness_pct: 75
    service: light.turn_on
    target:
      entity_id:
        - light.master_bedroom_left_lamp
        - light.master_bedroom_right_lamp
  - data:
      kelvin: 2700
      brightness_pct: 50
    service: light.turn_on
    target:
      entity_id:
        - light.living_room_left_lamp
        - light.living_room_right_lamp
  - service: switch.turn_on
    target:
      entity_id:
        - switch.kitchen_plug
        - switch.living_room_plug
mode: single

Anything in the logs that might be useful for us?

No response

Additional information

No response

janiversen commented 3 years ago

Actually it tells you what the problem are “ No suitable credentials for coaps://192.168.1.99” did the ikea gateway change ip address? It is very likely the same problem with the app.

It might be worse every day, especially since I am taking a break from the integration, after my last 2 PR (that would have solved the reload) was in reality rejected. Maybe another developer will take over.

peltsippi commented 3 years ago

Actually it tells you what the problem are “ No suitable credentials for coaps://192.168.1.99” did the ikea gateway change ip address? It is very likely the same problem with the app.

I am having same errors and my IKEA gateway has fixed ip configured on my internal DHCP.

Had to reboot ha 3 times today in a row before all started working again.

Ps I am really sorry that your PRs got rejected. I really appreciate your effort for trying to fix this quickly.

Mariusthvdb commented 3 years ago

And yet again, now showing

Logger: pytradfri.api.aiocoap_api
Source: /usr/local/lib/python3.9/site-packages/pytradfri/api/aiocoap_api.py:168 
First occurred: 17:08:07 (11 occurrences) 
Last logged: 23:04:46

Protocol is shutdown, cancelling command: 192.168.1.99 <Command put ['15001', 65709]: {'3311': [{'5850': 0}]}>
Protocol is shutdown, cancelling command: 192.168.1.99 <Command put ['15001', 65721]: {'15025': [{'5900': 0}]}>
Protocol is shutdown, cancelling command: 192.168.1.99 <Command put ['15001', 65641]: {'3312': [{'5850': 1}]}>
Protocol is shutdown, cancelling command: 192.168.1.99 <Command get ['15011', '15012']>
Protocol is shutdown, cancelling command: 192.168.1.99 <Command put ['15001', 65714]: {'3311': [{'5851': 0, '5712': 50}]}>

to be followed by:

Logger: coap
Source: /usr/local/lib/python3.9/site-packages/aiocoap/messagemanager.py:197 
First occurred: 23:08:14 (1 occurrences) 
Last logged: 23:08:14

Received Type.ACK from <aiocoap.transports.tinydtls.DTLSClientConnection object at 0x7f99e908b0>, but could not match it to a running exchange.

Btw, my Tradfri bridge has been on up .99 for 4 years now ...

Mariusthvdb commented 3 years ago

Actually it tells you what the problem are

“ No suitable credentials for coaps://192.168.1.99”

did the ikea gateway change ip address?

It is very likely the same problem with the app.

It might be worse every day, especially since I am taking a break from the integration, after my last 2 PR (that would have solved the reload) was in reality rejected. Maybe another developer will take over.

Was this when @MartinHjelmare asked you to change some code? He's always very precise, as we in the end all profit from.

Please do not leave, you are priceless for us . And we truly need you

Mariusthvdb commented 3 years ago

just another thought (and no more than that, because there are no docs on it as far as I know), and I tried to think beyond the box..... could it be there is some upper limit to the connected devices on the Ikea Bridge?

Things started to go haywire in my setup a couple of releases ago, but maybe..... that coincided with me adding 2 Starkvind air purifiers (that currently don't work too well I Ha to start with).

If either they reached the max amount of devices, or if these particular devices cause trouble, that might be a thing to investigate?

update suppose the documented upper limit per App (which ofc isnt the same as the Bridge but ets take it at that for now) is 50 devices. My setup is nearing that indeed, so I took out a few unused devices, just to be within specs. (Again, not that any error message hinted in that direction, but just to be sure). Unfortunately, the same scenario keeps happening during the daytime. Figured some temporary fix is needed, and until the HA integration behaves better, will test this automation:

  - alias: Reload Trafdri integration
    id: Reload Trafdri integration
    trigger:
      platform: time_pattern
      hours: /1
    condition:
      condition: time
      after: input_datetime.alarmclock_wd
      before: input_datetime.sleeptime_today
    action:
      service: script.reload_tradfri_integration

or more precise maybe:

  - alias: Reload Trafdri integration
    id: Reload Trafdri integration
    trigger:
      - platform: event
        event_type: system_log_event
        event_data:
          level: WARNING
        id: observation
      - platform: event
        event_type: system_log_event
        event_data:
          level: ERROR
        id: failed
    condition:
      condition: or
      conditions:
        - >
          {{'[homeassistant.components.tradfri.base_class] Observation failed for'
            in trigger.event.data.message and trigger.id == 'observation'}}
        - >
          {{'[homeassistant.components.tradfri] Keep-alive failed'
           in trigger.event.data.message and trigger.id == 'failed'}}

    action:
      - service: script.reload_tradfri_integration
      - service: system_log.write
        data:
          message: >
            Ikea integration was reloaded because of {{trigger.id}} with message:
            {{trigger.event.data.message}}
          level: warning
          logger: homeassistant.components.tradfri
Mariusthvdb commented 2 years ago

so found some more logs showing us the need to reload the integration, posted in https://github.com/home-assistant/core/issues/59392#issuecomment-974488518 too

suppose these spread Tradfri issues dont help in making clear the integration currently is bordering breaking Ha. Hoe can be bring this to the attention of the core dev team without tagging?

fwiw, this seems to catch the most prominent issues, and automate the reloading:

  - alias: Reload Tradfri integration
    id: Reload Tradfri integration
    mode: restart
    trigger:
      - platform: event
        event_type: system_log_event
        event_data:
          level: WARNING
        id: observation
      - platform: event
        event_type: system_log_event
        event_data:
          level: ERROR
        id: failed
    condition:
      condition: or
      conditions:
        - >
          {{'tradfri.base_class] Observation failed for'
            in trigger.event.data.message[0]}}
        - >
          {{'[tradfri] Keep-alive failed'
           in trigger.event.data.message[0]}}
        - >
          {{'Error received in UDP connection under DTLS' in trigger.event.data.message[0]}}
    action:
      - service: system_log.write
        data:
          message: >
            Ikea Tradfri '{{trigger.id}}' issue logged, reloading integration
          level: warning
          logger: homeassistant.components.tradfri
      - service: script.reload_tradfri_integration
      - service: system_log.write
        data:
          message: >
            Ikea Tradfri integration was reloaded because of {{trigger.id}}
          level: warning
          logger: homeassistant.components.tradfri
      - service: persistent_notification.create
        data:
          title: >
            {{trigger.event.data.level}}: Tradfri reloaded
          message: >
            {{now().timestamp()|timestamp_custom('%X')}}: Ikea Tradfri integration was
              reloaded because of {{trigger.id}}:
              Message: {{trigger.event.data.message[0]}},
              Logger: {{trigger.event.data.name}},
              Source: {{trigger.event.data.source}},
              Level: {{trigger.event.data.level}}
Mariusthvdb commented 2 years ago

keeping this alive. still happening on 2021.12.0.dev20211124 which should use pytradfri 7.2.0.

RosemaryOrchard commented 2 years ago

I have also seen this issue with similar logs. I didn't realise that 50 is the maximum number of devices on the gateway, but that's the number I (now) have. I'll try pairing some things to my Zigbee controller instead and seeing if the issue goes away.

Mariusthvdb commented 2 years ago

tbh, thats what I did, and it didn't help... I fear its more basic then that.

osmaa commented 2 years ago

My Trådfri gateway with 41 devices loses connection to Home Assistant sometimes multiple times a day - I've had this problem for months. I used to reboot the whole RPi running HA until I realized I could try just reloading the integration. When I do, the queued actions to the lights all trigger in sequence, which could mean lights flash a lot if it was down for some time.

@Mariusthvdb what is in script.reload_tradfri_integration your automation refers to?

HA core 2021.11.5 on Home Assistant OS 6.6

Possibly related log messages, though these don't appear every time the integration is stuck.

Logger: homeassistant.components.tradfri.base_class
Source: components/tradfri/base_class.py:41 
Integration: IKEA TRÅDFRI (documentation, issues) 
First occurred: November 28, 2021, 7:32:48 PM (838 occurrences) 
Last logged: 9:12:50 AM

Unable to execute command <Command put ['15001', 65580]: {'3311': [{'5851': 43, '5712': 50, '5708': 65279, '5707': 50243}]}>: {"r":"09"}
Unable to execute command <Command put ['15001', 65580]: {'3311': [{'5851': 43, '5712': 50, '5708': 65279, '5707': 51335}]}>: {"r":"09"}
Unable to execute command <Command put ['15001', 65538]: {'3311': [{'5850': 0}]}>: {"r":"09"}
Unable to execute command <Command put ['15001', 65580]: {'3311': [{'5851': 43, '5712': 50, '5708': 65279, '5707': 52428}]}>: {"r":"09"}
Unable to execute command <Command put ['15001', 65580]: {'3311': [{'5851': 43, '5712': 50, '5708': 65279, '5707': 53520}]}>: {"r":"09"}

Logger: pytradfri.api.aiocoap_api
Source: /usr/local/lib/python3.9/site-packages/pytradfri/api/aiocoap_api.py:168 
First occurred: November 28, 2021, 7:32:47 PM (33 occurrences) 
Last logged: 9:12:25 AM

Protocol is shutdown, cancelling command: 192.168.133.42 <Command put ['15001', 65543]: {'3311': [{'5711': 442, '5712': 100}]}>
Protocol is shutdown, cancelling command: 192.168.133.42 <Command get ['15011', '15012']>
Protocol is shutdown, cancelling command: 192.168.133.42 <Command put ['15001', 65580]: {'3311': [{'5851': 92, '5712': 50, '5708': 65279, '5707': 52428}]}>

Logger: homeassistant.core
Source: components/tradfri/base_class.py:39 
First occurred: November 28, 2021, 7:32:47 PM (1 occurrences) 
Last logged: November 28, 2021, 7:32:47 PM

Error executing service: <ServiceCall light.turn_on (c:cf770e6926ec1359a4e6a4510ce964f8): entity_id=['light.makkarin_valo'], params=color_temp=442, transition=10.0>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 1511, in catch_exceptions
    await coro_or_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1530, in _execute_service
    await handler.job.target(service_call)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await self.hass.helpers.service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 667, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 863, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 704, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 496, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/tradfri/light.py", line 270, in async_turn_on
    await self._api(command)
  File "/usr/src/homeassistant/homeassistant/components/tradfri/base_class.py", line 39, in wrapper
    await func(command)
  File "/usr/local/lib/python3.9/site-packages/pytradfri/api/aiocoap_api.py", line 198, in request
    result = await self._execute(api_commands)
  File "/usr/local/lib/python3.9/site-packages/pytradfri/api/aiocoap_api.py", line 166, in _execute
    _, res = await self._get_response(msg)
  File "/usr/local/lib/python3.9/site-packages/pytradfri/api/aiocoap_api.py", line 106, in _get_response
    pr_resp = await pr_req.response
  File "/usr/local/lib/python3.9/site-packages/aiocoap/protocol.py", line 731, in _run_outer
    await cls._run(app_request, response, weak_observation, protocol, log)
  File "/usr/local/lib/python3.9/site-packages/aiocoap/protocol.py", line 783, in _run
    blockresponse = await blockrequest.response
  File "/usr/local/lib/python3.9/site-packages/aiocoap/protocol.py", line 313, in send
    request_interface.request(plumbing_request)
  File "/usr/local/lib/python3.9/site-packages/aiocoap/tokenmanager.py", line 243, in request
    self.outgoing_requests[key] = request
TypeError: 'NoneType' object does not support item assignment
RosemaryOrchard commented 2 years ago

Last week I pared down the number of devices paired to my hub from 50 to 29 (and somehow it went back up to 30, definite flakiness on the IKEA hub side). As the hub was not good at telling Home Assistant it had fewer devices, I removed the integration and re-added it. Since then I have only had one dropout. So it's possible (though far from definite) that the higher number of devices connected to the hub was causing the issue.

Mariusthvdb commented 2 years ago

Think I will give that a try too, who knows.... Though here, the integration did take out the devices I had deleted from within the app correctly

as for the log messages: this is the order of things I now see, and note that my automatic Integration restart is logged too, possibly causing several of the other log entries:

2021-11-29 15:31:32 WARNING (MainThread) [coap] Error received in UDP connection under DTLS: [Errno 111] Connection refused
2021-11-29 15:31:32 WARNING (MainThread) [homeassistant.components.tradfri] Ikea Tradfri 'observation' issue logged, reloading integration
2021-11-29 15:31:32 WARNING (MainThread) [pytradfri.api.aiocoap_api] Protocol is shutdown, cancelling command: 192.168.1.99 <Command get ['15011', '15012']>
2021-11-29 15:31:32 WARNING (MainThread) [pytradfri.api.aiocoap_api] Protocol is shutdown, cancelling command: 192.168.1.99 <Command get ['15011', '15012']>
2021-11-29 15:31:32 WARNING (MainThread) [pytradfri.api.aiocoap_api] Protocol is shutdown, cancelling command: 192.168.1.99 <Command get ['15011', '15012']>
2021-11-29 15:31:32 WARNING (MainThread) [pytradfri.api.aiocoap_api] Protocol is shutdown, cancelling command: 192.168.1.99 <Command get ['15011', '15012']>
2021-11-29 15:31:32 WARNING (MainThread) [coap] Error received in UDP connection under DTLS: [Errno 111] Connection refused
2021-11-29 15:31:42 WARNING (MainThread) [homeassistant.components.tradfri] Ikea Tradfri 'observation' issue logged, reloading integration
2021-11-29 15:31:52 WARNING (MainThread) [homeassistant.components.tradfri] Ikea Tradfri integration was reloaded because of observation
2021-11-29 15:32:02 WARNING (MainThread) [homeassistant.config_entries] Config entry '192.168.1.99' for tradfri integration not ready yet: ('Request timed out.', TimeoutError()); Retrying in background
2021-11-29 15:32:12 WARNING (MainThread) [homeassistant.config_entries] Config entry '192.168.1.99' for tradfri integration not ready yet: ('Request timed out.', TimeoutError()); Retrying in background
2021-11-29 15:33:49 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry 192.168.1.99 for cover
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 311, in async_setup
    result = await component.async_setup_entry(hass, self)  # type: ignore
  File "/usr/src/homeassistant/homeassistant/components/cover/__init__.py", line 165, in async_setup_entry
    return await component.async_setup_entry(entry)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 162, in async_setup_entry
    raise ValueError("Config entry has already been setup!")
ValueError: Config entry has already been setup!
2021-11-29 15:33:49 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry 192.168.1.99 for fan
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 311, in async_setup
    result = await component.async_setup_entry(hass, self)  # type: ignore
  File "/usr/src/homeassistant/homeassistant/components/fan/__init__.py", line 214, in async_setup_entry
    return await component.async_setup_entry(entry)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 162, in async_setup_entry
    raise ValueError("Config entry has already been setup!")
ValueError: Config entry has already been setup!
2021-11-29 15:33:49 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry 192.168.1.99 for light
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 311, in async_setup
    result = await component.async_setup_entry(hass, self)  # type: ignore
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 538, in async_setup_entry
    return await component.async_setup_entry(entry)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 162, in async_setup_entry
    raise ValueError("Config entry has already been setup!")
ValueError: Config entry has already been setup!
2021-11-29 15:33:49 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry 192.168.1.99 for sensor
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 311, in async_setup
    result = await component.async_setup_entry(hass, self)  # type: ignore
  File "/usr/src/homeassistant/homeassistant/components/sensor/__init__.py", line 131, in async_setup_entry
    return await component.async_setup_entry(entry)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 162, in async_setup_entry
    raise ValueError("Config entry has already been setup!")
ValueError: Config entry has already been setup!
2021-11-29 15:33:49 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry 192.168.1.99 for switch
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 311, in async_setup
    result = await component.async_setup_entry(hass, self)  # type: ignore
  File "/usr/src/homeassistant/homeassistant/components/switch/__init__.py", line 79, in async_setup_entry
    return await component.async_setup_entry(entry)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 162, in async_setup_entry
    raise ValueError("Config entry has already been setup!")
ValueError: Config entry has already been setup!

@osmaa my script is:

script:

  reload_tradfri_integration:
    alias: Reload Tradfri integration
    mode: restart
    sequence:
      service: homeassistant.reload_config_entry
      data:
        entry_id: 37702840aa084ed4af2155b0811b1951

find the right entry_id in the core.config_entries

peltsippi commented 2 years ago

Just got behaviour that might be connected to the issue.

Tried to switch on a switch and it did not work because integration was stuck again (20 minutes after automated integration reload). Then I restarted homeassistant and switch instantly reacted and turned on during restart before Integration was reloaded. There was nothing else turning on the switch except my manual toggle through UI.

Sorry no logs or anything because reset wipes everything on my setup. But this is something one can test next time integration freezes and see if functionality is similar for everyone.

Mariusthvdb commented 2 years ago

Check https://ww8.ikea.com/ikeahomesmart/releasenotes/releasenotes.html and see improvement . Seems to no longer lockup ? ( some lights don’t respond) but the HA interface remains responsive ? #fingerscrossed

peltsippi commented 2 years ago

Actually my light bulbs were updated 2 days ago(noticed from bulbs restarting) and I updated ha yesterday and no issues for now.

peltsippi commented 2 years ago

Been testing since and only issue I've had is the old usual: if trådfri gateway is restarted, ha needs also to be restarted or integration breaks.

No other issues with anything so worksforme at least for now.

github-actions[bot] commented 2 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.