home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
72.27k stars 30.25k forks source link

merge_response mutates original response objects #127510

Open igrybkov opened 4 days ago

igrybkov commented 4 days ago

The problem

New template function merge_response modifies original object. This causes the issue that it can't be applied more than once because for the second time it fails with conflicting keys. I'd expect it to keep original objects unmodified.

What version of Home Assistant Core has the issue?

core-2024.10.0

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

No response

Link to integration documentation on our website

https://www.home-assistant.io/docs/configuration/templating/#merge-action-responses

Diagnostics information

No response

Example YAML snippet

variables:
  raw: "{{ calendar_response }}"
  merge: "{{ merge_response(calendar_response) | sort(attribute='start') }}"
  modified_raw: "{{ calendar_response }}"

Outputs:

raw:
  calendar.family:
    events:
      - start: "2024-10-04T19:30:00-05:00"
        end: "2024-10-04T20:30:00-05:00"
        summary: "An event"
      - start: "2024-10-05T08:00:00-05:00"
        end: "2024-10-05T12:00:00-05:00"
        summary: Another event
merge:
  - start: "2024-10-04T19:30:00-05:00"
    end: "2024-10-04T20:30:00-05:00"
    summary: "An event"
    entity_id: calendar.family
    value_key: events
  - start: "2024-10-05T08:00:00-05:00"
    end: "2024-10-05T12:00:00-05:00"
    summary: Another event
    entity_id: calendar.family
    value_key: events
modified_raw:
  calendar.family:
    events:
      - start: "2024-10-04T19:30:00-05:00"
        end: "2024-10-04T20:30:00-05:00"
        summary: "An event"
        entity_id: calendar.family
        value_key: events
      - start: "2024-10-05T08:00:00-05:00"
        end: "2024-10-05T12:00:00-05:00"
        summary: Another event
        entity_id: calendar.family
        value_key: events

Notice these keys added to all events:

        entity_id: calendar.family
        value_key: events

If I try to use the function again, it raises ValueError: Response dictionary already contains key 'entity_id'



### Anything in the logs that might be useful for us?

_No response_

### Additional information

_No response_
igrybkov commented 4 days ago

@gjohansson-ST pinging you since I saw it was your PR

gjohansson-ST commented 4 days ago

Yes, so it's not an issue as it's made as such on purpose, it even is mentioned in the documentation It was obviously not constructed to use the function twice and what's your use-case for doing so?

igrybkov commented 3 days ago

@gjohansson-ST ah sorry, I totally missed that part of the documentation. The use-case is doing any pre-check on the data before processing. In my script I check if the merged list is empty and stop the flow. Otherwise, proceed to build llm prompt. There are workarounds for this (I'm using (calendar_response.values() | map(attribute='events') | map('length') | list | sum) == 0 now), but the most obvious attempt to try using the new function in two places just failed.

igrybkov commented 3 days ago

I re-checked the documentation again, and even with the context you provided it says nothing about not being able to use the function twice, and that the input would be modified:

The entity_id key is appended to each dictionary within the template output list as a reference of origin. If the input dictionary already contains an entity_id key, the template will fail.

From this description I would expect the output to have entity_id added, but not the input.

If you insist that current behavior is correct, could you please make the limitation more explicit in the documentation? Thanks!

gjohansson-ST commented 3 days ago

We can of course clarify this in the documentation but it does say that the output is adding the entity_id´ key and alsovalue_key` if the input was a list. Isn't it quite obvious that the output becomes the input for a next one?

If those keys are already present we don't want to overwrite them, hence the reason to raise ValueError. I will make a note to update the documentation that if those key's are already present, there will be an error.

igrybkov commented 2 days ago

@gjohansson-ST, I'm sorry, are you sure we're talking about the same thing?

I think what you mean is {{ merge_response(merge_response(calendar_response)) }}. This is expected to fail because merge_response modifies data structure in the output and that's the whole point.

What I mean is that a template like this fails:

{% set calendar_response = {"calendar.family": {"events": [{"summary": "An event"}]}} %}
{{ merge_response(calendar_response) }}
{{ merge_response(calendar_response) }}

^ you can copy-paste this to developer tools to check

There's no piping, no overwriting variables. It's just that merge_response modifies original calendar_response during the processing and for the second call the calendar_response is not the same as it was for the first one. I think this is totally unexpected side effect. None of the template functions I remember modify the data they take as an input. Map doesn't modify original objects. Select doesn't write to the input.

gjohansson-ST commented 2 days ago

Ah. That's much clearer. I would agree that seems unexpected. Will look at it shortly 👍