legrego / homeassistant-elasticsearch

Publish Home-Assistant events to Elasticsearch
https://legrego.github.io/homeassistant-elasticsearch/
MIT License

Publish all states at interval and state changes #256

Closed strawgate closed 1 month ago

strawgate commented 1 month ago

Publishing state changes is a great way to track changes to an entity whose state changes rapidly (many times per minute).

Publishing all is a great way to ensure you can visualize data in Kibana without gaps.

We should consider a combination that ensures Kibana visualizes without gaps but also publishes state changes that occur more frequently than the ES document publishing interval.

strawgate commented 1 month ago

Perhaps we can add another mode where if an entity has had a state change in the last interval we remove it from the "all entities" publishing list for that interval?
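
A minimal sketch of that idea, not taken from the integration's code: the function name and arguments below are hypothetical, and it assumes we already know which entities changed during the interval.

```python
def entities_for_interval_sweep(all_entity_ids, changed_entity_ids):
    """Return the entities that still need a periodic document this interval.

    Hypothetical helper: entities that already produced a state-change
    document during the interval are skipped by the "all entities" sweep.
    """
    changed = set(changed_entity_ids)
    # Keep the original ordering, dropping entities that already published.
    return [entity_id for entity_id in all_entity_ids if entity_id not in changed]
```

With this, an entity that changed during the interval is covered only by its state-change documents, and every other entity still gets one periodic document.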

strawgate commented 1 month ago

Well, this is good timing: https://developers.home-assistant.io/blog/2024/03/20/state_reported_timestamp/

> The State object is now always updated and an event is always fired when an integration sets the state of an entity, regardless of any change to the state or a state attribute. This is implemented by adding a new timestamp, State.last_reported and a new event state_reported.

We currently don't send a document if the new value is the same as the old value, so I don't think this impacts current users.

Perhaps the new mode just doesn't do that deduplication and sends a document every time a device reports in.
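
Roughly, the difference between the two behaviors could look like the sketch below. This is illustrative only: `documents_for_reports` and its `deduplicate` flag are made-up names, not part of the integration or of Home Assistant.

```python
_UNSET = object()  # sentinel meaning "no previous value seen yet"


def documents_for_reports(reported_values, deduplicate=True):
    """Return the values that would be sent as documents.

    reported_values: state values in the order a device reported them.
    With deduplicate=True, a report equal to the previous one is skipped;
    with deduplicate=False, every report produces a document.
    """
    documents = []
    previous = _UNSET
    for value in reported_values:
        if not (deduplicate and value == previous):
            documents.append(value)
        previous = value
    return documents
```

For example, reports of `"21.5", "21.5", "21.7"` would yield two documents with deduplication and three without it.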

legrego commented 1 month ago

> We should consider a combination that ensures Kibana visualizes without gaps but also publishes state changes that occur with a higher frequency than ES document publishing.

If I understand you correctly, this should be the current behavior. The difference between "Publish all" and "Publish state changes" is an additional step to include entities which did not have a state change.

"Publish state changes" will publish all state changes that happen within the interval. "Publish all" will publish all state changes that happen within the interval, and will also include an entry for any entities which did not undergo a state change during the interval (as tracked by the entity_counts variable in the async_do_publish function):

https://github.com/legrego/homeassistant-elasticsearch/blob/e803be02a999d9656a57021d047b7e6401da0f77/custom_components/elasticsearch/es_doc_publisher.py#L191-L207
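
As a simplified illustration of that description (not a copy of the linked code), the per-interval logic can be sketched like this. The function name, the tuple shape, and the `current_states` argument are assumptions made for the example; only the `entity_counts` name comes from the real function.

```python
from collections import Counter


def build_interval_batch(queued_changes, current_states, publish_all):
    """Assemble the documents for one publish interval.

    queued_changes: list of (entity_id, state) tuples queued during the interval.
    current_states: dict mapping every known entity_id to its current state.
    publish_all: True for "Publish all", False for "Publish state changes".
    """
    # Both modes publish every change that was queued during the interval.
    batch = list(queued_changes)
    if publish_all:
        # Count how many queued documents each entity already has.
        entity_counts = Counter(entity_id for entity_id, _ in queued_changes)
        for entity_id, state in current_states.items():
            # Entities with no change this interval get one extra entry.
            if entity_counts[entity_id] == 0:
                batch.append((entity_id, state))
    return batch
```

In this sketch, an entity that changed twice contributes two documents in either mode, and an unchanged entity contributes one document only when `publish_all` is true.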

strawgate commented 1 month ago

I believe that state changes are ignored by the listener if the mode is set to publish all?

https://github.com/legrego/homeassistant-elasticsearch/blob/e803be02a999d9656a57021d047b7e6401da0f77/custom_components/elasticsearch/es_doc_publisher.py#L92-L115

Edit: never mind, I see it now.

legrego commented 1 month ago

> Edit: never mind, I see it now

For posterity, this is how the different publish modes are intended to work:

| Publish Mode | State Change | Attribute Change | No Change |
| --- | --- | --- | --- |
| All | ✅ Publishes | ✅ Publishes | ✅ Publishes |
| Any Changes | ✅ Publishes | ✅ Publishes | 🚫 Does not publish |
| State Changes | ✅ Publishes | 🚫 Does not publish | 🚫 Does not publish |

Edit: I've opened a PR to include this information in the README: https://github.com/legrego/homeassistant-elasticsearch/pull/258
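
The table above can also be read as a small decision function. This is a sketch for illustration: `PublishMode` and `should_publish` are hypothetical names, not the integration's actual API.

```python
from enum import Enum


class PublishMode(Enum):
    ALL = "All"
    ANY_CHANGES = "Any Changes"
    STATE_CHANGES = "State Changes"


def should_publish(mode, state_changed, attributes_changed):
    """Return True if a document is published for the entity under this mode."""
    if mode is PublishMode.ALL:
        # Publishes regardless of whether anything changed.
        return True
    if mode is PublishMode.ANY_CHANGES:
        # Publishes on a state change or an attribute change.
        return state_changed or attributes_changed
    # STATE_CHANGES: an attribute-only change is not enough.
    return state_changed
```

Each row of the table corresponds to one mode, and each column to one combination of the two flags.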

legrego commented 1 month ago

@strawgate is there anything for us to do here, or are we good to close this issue?