home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
72.67k stars 30.42k forks source link

Automation with MQTT actions 'Stopped because of unknown reason "null" ' #98664

Closed jeroen85 closed 1 year ago

jeroen85 commented 1 year ago

The problem

As suggested by @allenporter in https://github.com/home-assistant/core/issues/97721 I create this new issue.

Not sure if it is the same issue as in https://github.com/home-assistant/core/issues/97721 ... I experience an issue with an automation (blueprint) activating a scene after. The scene is not executed properly (light does not turn off in this case). Sometimes it works, sometimes it does not work. The scene turns off some lights in the living room and 2 in the dining room. The issues are with the 2 on the dining rooml Sometimes only one of the two lights turn off. Sometimes, both stay on, Sometimes both turn off...

Button: Zigbee (Hue Tap) Lights: Philips Hue lights Both operate via Zigbee2MQTT.

Worked flawlessly the last 2 years.

Also works OK when triggered via dev tools, directly via scene configuration page or when 'run' the action within the automation configuration page.

It feels like the scene execution is stopped too early.

Screenshot 2023-08-19 at 08 45 06

Automation/blueprint

mode: restart
max_exceeded: silent
trigger:
  - platform: state
    entity_id: sensor.hue_tap_keuken_action
action:
  - variables:
      command: '{{ trigger.to_state.state }}'
  - choose:
      - conditions:
          - '{{ command == ''press_1'' }}'
        sequence:
          - service: scene.turn_on
            target:
              entity_id: scene.beneden_uit
            metadata: {}
      - conditions:
          - '{{ command == ''press_2'' }}'
        sequence:
          - service: scene.turn_on
            target:
              entity_id: scene.backup_hue_lampen_beneden_tap_2
            metadata: {}
      - conditions:
          - '{{ command == ''press_3'' }}'
        sequence:
          - service: scene.turn_on
            target:
              entity_id: scene.backup_keukentafel_tap_3
            metadata: {}
      - conditions:
          - '{{ command == ''press_4'' }}'
        sequence:
          - service: scene.turn_on
            target:
              entity_id: scene.backup_keukentafel_tap_4
            metadata: {}
id: '1658572961723'
alias: Woonkamer / Keuken - Hue Tap
description: ''

Trace trace automation.woonkamer_keuken_hue_tap 2023-08-18T20_25_31.437655+00_00.txt

Scene entities First one (woonkamer) works OK, 2nd and 3rd ones (Eettafel) give the issues.

Screenshot 2023-08-19 at 09 03 15

HA version: 2023.8.2

Additional info I thought, adding a 1s delay after the scene activation might be a workaround. But it didn't work

Screenshot 2023-08-19 at 09 20 11

When I directly turn off the lights within the automation. It works OK (it turns off all the lights), but the red 'null' error still occurs

service: light.turn_off
data: {}
target:
  entity_id:
    - light.eettafel_links
    - light.eettafel_rechts
    - light.woonkamer

What version of Home Assistant Core has the issue?

2023.8.2

What was the last working version of Home Assistant Core?

I guess before the 2023.8 releas

What type of installation are you running?

Home Assistant Container

Integration causing the issue

MQTT

Link to integration documentation on our website

https://www.home-assistant.io/integrations/mqtt/

Diagnostics information

config_entry-mqtt-015cc97c23625204817cd8654f3b7d7c.json.txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

home-assistant[bot] commented 1 year ago

Hey there @emontnemery, @jbouwh, mind taking a look at this issue as it has been labeled with an integration (mqtt) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `mqtt` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Renames the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign mqtt` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


mqtt documentation mqtt source (message by IssueLinks)

jbouwh commented 1 year ago

What is the MQTT specific case in here? Can you share some core logs with the errors traces logged. If it is MQTT specific you can get additional info by turning on debug logging for MQTT.

jeroen85 commented 1 year ago

@allenporter asked me to report this issue against the integration that owns the light. In my case it is zigbee2mqtt, which uses mqtt. I guess he has an idea about the 'root cause' of these issues...

Within the logs I do not see anything with respect to this issue. But also no SET actions with respect to the light which is not working.

Example, I use the automation to turn on a scene. The scene should turn ON light 'woonkamer' and 2 lights 'eettafel'. The light 'woonkamer' turns ONN. The light 'eettafel' stays OFF

In the debug loggin I only see this line. Nothing related to turning on 'eettafel' 2023-08-19 11:24:33.900 DEBUG (MainThread) [homeassistant.components.mqtt.client] Transmitting message on zigbee2mqtt/Woonkamer/set: '{"state":"ON","color":{"x":0.4893,"y":0.3659},"brightness":116}', mid: 194, qos: 0

So maybe it is not related to MQTT, but more related to some other part of HA (light, scenes?)? But I don't have enough knowledge to identify the right integration.

joostlek commented 1 year ago

Is the scene turning on also disabling the current automation?

jbouwh commented 1 year ago

If you see no error after the debug message, then it is unlikely it is caused by the MQTT entity.

allenporter commented 1 year ago

Thanks for engaging, lets see if we can figure out whats going wrong together. I'm not a script/scene expert, so trying to understand possible things that could go wrong here:

I think mqtt could be raising an error there (without logging anything, which isn't necessarily a problem) but script is not necessarily logging the error in a helpful way.

One step we can take can be to improve that logging, but there does seem possible there is an mqtt error here.

allenporter commented 1 year ago

I tried some unit testing of the script and when the scene service raises HomeAssistantError("It is not working!") I see do see a log message that is better:

ERROR:homeassistant.helpers.script.test_name:Test Name: Error executing script. Error for scene at pos 1: It is not working!

so perhaps something else going on here then.

allenporter commented 1 year ago

@jeroen85 Can you turn up debug logging for homeassistant.helpers.script? I think it would be helpful to see the debug Executing step log messages.

jeroen85 commented 1 year ago

Is the scene turning on also disabling the current automation?

@joostlek No

@allenporter Turned debug on for homeassistant.helpers.script. Nothing appeared in the log.

allenporter commented 1 year ago

Is the scene turning on also disabling the current automation?

@joostlek No

@allenporter Turned debug on for homeassistant.helpers.script. Nothing appeared in the log.

Ok I may have the wrong package. Maybe it ends up under homeassistant.components.automation. Will look more tomorrow...

jeroen85 commented 1 year ago

When I look into the core debug logging, these 3 lines appear. I am not an expert, but I guess the last 2 are to turn off my light. In this specific case, only one of the lights turned of.

2023-08-20 08:34:04.954 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=scene.beneden_uit, old_state=<state scene.beneden_uit=2023-08-20T06:32:16.812088+00:00; entity_id=['light.woonkamer', 'light.eettafel_links', 'light.eettafel_rechts'], id=1635754330772, friendly_name=Beneden uit @ 2023-08-20T08:32:16.812178+02:00>, new_state=<state scene.beneden_uit=2023-08-20T06:34:04.954297+00:00; entity_id=['light.woonkamer', 'light.eettafel_links', 'light.eettafel_rechts'], id=1635754330772, friendly_name=Beneden uit @ 2023-08-20T08:34:04.954387+02:00>>
2023-08-20 08:34:04.958 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event call_service[L]: domain=light, service=turn_off, service_data=entity_id=light.eettafel_links>
2023-08-20 08:34:04.959 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event call_service[L]: domain=light, service=turn_off, service_data=entity_id=light.eettafel_rechts>

The homeassistant.components.automation debug logs indicated a restart of the automation. It appears that once the button (on the Hue Tap) is pressed the automation is started. Once the button is released, a new event is triggering the automation to restart. This consequently stops the 1st automation somewhere in the middle of performing its actions. This might also explain why sometimes both lights turned off, sometimes only one turned off etc. I changed the mode of the blueprint to single, which solved my specific issues.

2023-08-20 08:34:04.942 DEBUG (MainThread) [homeassistant.components.automation.woonkamer_keuken_hue_tap] Automation trigger 'None' triggered by state of sensor.hue_tap_keuken_action
2023-08-20 08:34:04.942 INFO (MainThread) [homeassistant.components.automation.woonkamer_keuken_hue_tap] Woonkamer / Keuken - Hue Tap: Restarting
2023-08-20 08:34:04.946 INFO (MainThread) [homeassistant.components.automation.woonkamer_keuken_hue_tap] Woonkamer / Keuken - Hue Tap: Running automation actions

Still does not explain (at least for me) why this wasn't an issue the last 2 years. And the error message could be improved Stopped because of unknown reason "null"

Anyway, thanks for you help! The debugging steps helped me solve my specific issue :-)

jbouwh commented 1 year ago

I understand the entities are MQTT entities. But how are the scenes created, ?

jeroen85 commented 1 year ago

A scene with one group of lights (woonkamer) and two individual lights (eettafel).

1000019160

jbouwh commented 1 year ago

So a scene you manually created from the UI

jeroen85 commented 1 year ago

Correct

jbouwh commented 1 year ago

I have been testing, and it seems when I put some hue lights in a scene can call that it works fine. When MQTT lights are put in a scene nothing happens. But I see no errors in the log. I have debug logging on by default for MQTT. The automaton triggers the service call:

2023-08-20 15:32:25.286 INFO (MainThread) [homeassistant.components.automation.test_98664] Test #98664: Restarting
2023-08-20 15:32:25.290 INFO (MainThread) [homeassistant.components.automation.test_98664] Test #98664: Running automation actions
2023-08-20 15:32:25.290 INFO (MainThread) [homeassistant.components.automation.test_98664] Test #98664: Executing step setting variables
2023-08-20 15:32:25.295 INFO (MainThread) [homeassistant.components.automation.test_98664] Test #98664: Choose at step 2: choice 1: Running automation actions
2023-08-20 15:32:25.295 INFO (MainThread) [homeassistant.components.automation.test_98664] Test #98664: Choose at step 2: choice 1: Executing step call service

When I try to call the scene with the MQTT lights from the developer tools it does not work either. Can you confirm that?

jbouwh commented 1 year ago

It seems async_turn_on of mqtt json light (the same lights Z2M uses) is not called when scene.turn_on is called with MQTT lights.. When I just call the service light turn_on. It works as expected. It seems further digging is needed.

jeroen85 commented 1 year ago

Calling a scene (with MQTT / Z2M lights) via scene.turn_on (dev tools) works without any issue in my installation.

In my case it appeared that the mode of the automation (restart) in combination with a Hue Tap device was the issue. The Hue tap creates an action on press and on release when a button is pressed.

During the first on press action, the scene was called via the automation. The automation stopped this first run due to the 'on release' trigger a few milliseconds later. I guess it also stops the 'execution' of the scene. Setting the mode of automation to single solved my issues.

jbouwh commented 1 year ago

Okay, It was working finally. Only my test conditions were not ideal. I saw now MQTT response be case the entity state was already the desired state. So now is the question if this this is really MQTT related or not. May be the states in the scene you try to restore have an issue. Can you share scenes.yaml Only the scene that causes trouble. So it only errors when the automation is in restart mode.

jbouwh commented 1 year ago

When I create a scene with 3 MQTT lights and create an automation in restart mode, And let the state of an entity trigger to turn on the scene, then that works fine in my test env. May be there is an other side effect causing the issue?

allenporter commented 1 year ago

Calling a scene (with MQTT / Z2M lights) via scene.turn_on (dev tools) works without any issue in my installation.

In my case it appeared that the mode of the automation (restart) in combination with a Hue Tap device was the issue. The Hue tap creates an action on press and on release when a button is pressed.

During the first on press action, the scene was called via the automation. The automation stopped this first run due to the 'on release' trigger a few milliseconds later. I guess it also stops the 'execution' of the scene. Setting the mode of automation to single solved my issues.

It sounds like we should close this issue then? everything seems to be working as intended though maybe something change with the hue button events, unclear, but either way the on release is what caused the stop to happen.

Shall we close this or is there something else we're tracking down here?

jbouwh commented 1 year ago

The way the automation is triggered is indeed what can cause the issue. A good thing might be too check the hass context too. But , to be able to get progress on this issue, first there should be a usecase that can reproduced. That seems not the case afterall with this one. It is not very likely an MQTT issue, I would have expected more in the logs in that case.

allenporter commented 1 year ago

I guess i'm confused about what other issues are left here? It seems like the automation getting cancelled/aborted is the primary cause and i'm not sure what else needs to be investigated. (Since i encouraged this to be filed against the integration, just wanted to be considerate of your time)

jeroen85 commented 1 year ago

I agree with you both:

I was unable to get to this conclusion myself at first due to the vague error message Stopped because of unknown reason "null" This could/should be improved, for example Stopped because of restarting the automation

Thanks Jan and Allen for your respons/help!