Closed legrego closed 5 years ago
In general I'm :+1: on this, as 30gb is usually a good size for a shard with this type of data. I'm wondering, though, how long it will take to reach 30gb. If it takes a year, that might complicate things, e.g. if a mapping change is necessary in the meantime.
That's a good point... I imagine it would take quite some time to hit the 30gb mark for most installations. For my own indices, I'm seeing ~150mb per 1 million documents.
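To put the ~150mb per 1 million documents figure in perspective, here is a rough back-of-the-envelope sketch. The `docs_per_day` value is purely a hypothetical ingest rate, not a measurement from any installation, and the sketch treats 1gb as 1024mb.

```python
def docs_to_reach(target_gb: float, mb_per_million_docs: float) -> float:
    """Estimate how many documents are needed to fill an index of target_gb."""
    target_mb = target_gb * 1024  # assumption: 1gb == 1024mb
    return target_mb * 1_000_000 / mb_per_million_docs


def days_to_reach(target_gb: float, mb_per_million_docs: float, docs_per_day: float) -> float:
    """Estimate days until rollover at a constant (hypothetical) ingest rate."""
    return docs_to_reach(target_gb, mb_per_million_docs) / docs_per_day


if __name__ == "__main__":
    docs = docs_to_reach(30, 150)  # 30 * 1024 / 150 = 204.8 million documents
    print(f"{docs:,.0f} documents to reach 30gb")

    # docs_per_day=50_000 is a made-up rate, only for illustration
    days = days_to_reach(30, 150, docs_per_day=50_000)
    print(f"about {days:,.0f} days at 50,000 documents per day")
```

At that observed document size, a 30gb threshold corresponds to roughly 200 million documents, so a small installation could plausibly go years without a size-based rollover.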
Pinging @dsztykman, since you had the original suggestion for 30gb. Do you have any thoughts on this?
Agreed, it's a bit of an issue. Maybe we should think about new index names with versioning included, like hass-events-v1-XXX, and then whenever we change the mapping we create a new version and manage the switch with aliases.
Otherwise we're going to end up with too many shards and too little data and the performance is going to suffer in the long term.
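A minimal sketch of what versioned index names plus an alias could look like. The helper names, the `hass-events` alias, and the `000001` suffix are hypothetical and only for illustration; the returned dict follows the shape of an Elasticsearch `_aliases` request body, where `is_write_index` marks the index that receives new documents.

```python
def versioned_index_name(prefix: str, mapping_version: int, suffix: str) -> str:
    """Build a name such as 'hass-events-v1-000001' (naming scheme is an assumption)."""
    return f"{prefix}-v{mapping_version}-{suffix}"


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build an _aliases request body that keeps the old index readable
    through the alias and makes the new index the write target."""
    return {
        "actions": [
            {"add": {"index": old_index, "alias": alias, "is_write_index": False}},
            {"add": {"index": new_index, "alias": alias, "is_write_index": True}},
        ]
    }


if __name__ == "__main__":
    old = versioned_index_name("hass-events", 1, "000001")
    new = versioned_index_name("hass-events", 2, "000001")
    # The body would be sent to POST /_aliases by whatever client the plugin uses.
    print(alias_swap_actions("hass-events", old, new))
```

With this arrangement, queries against the alias still see data from both mapping versions, while new events land only in the index created with the new mapping.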
The question is: how do we detect a change in mapping from Home Assistant directly? Like adding a new device?
I like the idea of versioning the index names.
> The question is how do we detect a change in mapping from home assistant directly? Like adding a new device?
My goal for the index mapping is to be device agnostic. It shouldn't care which devices are registered, or how many devices exist. I think the only times the mapping should change are:
1) Defects in the mapping. I've encountered this as more people use the plugin with various configurations. See also #32.
2) ES version compatibility: if supporting the next major version requires changes to the mapping.
3) Enhancement requests.
So essentially, I only expect the mapping to change if this plugin requires it, or if Elasticsearch requires it. The individual installations shouldn't have any bearing on the mapping, so we should only have to bump the version as a result of changes to this plugin.
Makes sense, so essentially we change the version and the mapping when we receive an error from ES.
A bit off-topic for this PR, but if you're interested in seeing/reviewing the versioning idea, I have a PR up for #32 which incorporates this: https://github.com/legrego/homeassistant-elasticsearch/pull/40
The current defaults for index rollovers are too aggressive. This results in indices that are too small. The default `rollover_size` should be set to `30gb`, and both `rollover_age` and `rollover_docs` should be initially unset.
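As a rough illustration of the proposed defaults, the sketch below builds a rollover request body in which unset options are simply omitted. The function name and the mapping from the plugin's `rollover_size`, `rollover_age`, and `rollover_docs` options to Elasticsearch's `max_size`, `max_age`, and `max_docs` conditions are assumptions for illustration, not the plugin's actual code.

```python
from typing import Optional


def rollover_conditions(
    rollover_size: Optional[str] = "30gb",  # proposed default
    rollover_age: Optional[str] = None,     # proposed default: unset
    rollover_docs: Optional[int] = None,    # proposed default: unset
) -> dict:
    """Build a rollover request body, skipping any condition left unset."""
    conditions = {}
    if rollover_size is not None:
        conditions["max_size"] = rollover_size
    if rollover_age is not None:
        conditions["max_age"] = rollover_age
    if rollover_docs is not None:
        conditions["max_docs"] = rollover_docs
    return {"conditions": conditions}


if __name__ == "__main__":
    # With the proposed defaults, only the size condition is sent.
    print(rollover_conditions())
    # A user can still opt in to an age-based rollover explicitly.
    print(rollover_conditions(rollover_age="365d"))
```

Leaving the age and document conditions out entirely, rather than sending large placeholder values, means the index rolls over only when it reaches the size threshold.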