Closed by sopel 10 years ago
We already discussed this briefly in yesterday's hangout, and @mrdavidlaing mentioned the built-in _ttl as an alternative, which he still favours:
Architecturally I’m still in favour of storing a _ttl at the document level; and then letting elasticsearch clean itself up.
Specifying how long you want the document to exist at the point of shipping it in makes a lot of sense to me, with a default expiry applied to those documents which haven’t chosen an explicit ttl.
We should revisit this accordingly.
It feels like a very similar decision to including the timezone with a log when shipping it in, rather than having something external add the timezone after the log has been shipped.
Anyway, the elasticsearch _ttl functionality is described here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-field.html
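For illustration, here is a minimal sketch of what a mapping with a default expiry could look like, based on the `_ttl` mapping field as documented for Elasticsearch 1.x. The type name `log_event` and the `30d` retention are hypothetical, and the helper `ttl_mapping` is mine, not part of any library; it only builds the request body and does not talk to a cluster.

```python
import json


def ttl_mapping(doc_type, default_ttl):
    """Build a mapping body that enables _ttl with a default expiry.

    Documents indexed without an explicit ttl would fall back to
    `default_ttl`; documents shipped with their own ttl keep it.
    """
    return {doc_type: {"_ttl": {"enabled": True, "default": default_ttl}}}


# Hypothetical type name and retention period, for illustration only.
body = ttl_mapping("log_event", "30d")
print(json.dumps(body, indent=2))
```

A shipper that wants a different lifetime for a given document would then set a ttl on that document at index time, which is the "chosen at the point of shipping" part of the argument.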
@dpb587 has convinced me that:
Thus, my opinion has been changed to favour the curator approach over the document _ttl approach.
:information_source: Curator seems to be maturing as expected and is by now available as a PyPI package at least, which eases consistent deployment.
I've used `curator -d 29` successfully on the main cluster the past two Tuesdays now. We could make it a Jenkins job, but like our other tasks it requires an SSH tunnel, which adds a lot of extra bootstrapping to what should be a very simple one-liner. I wish there was another way...
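To make the age-based pruning idea concrete, here is a rough sketch of the selection step. It is not Curator's implementation: it assumes daily indices named `logstash-YYYY.MM.DD`, and the function name `indices_older_than` and the strict "older than the cutoff day" boundary are my own assumptions.

```python
from datetime import date, datetime, timedelta


def indices_older_than(index_names, days, today, prefix="logstash-"):
    """Return the daily indices whose date is more than `days` days before `today`.

    Names that do not match `<prefix>YYYY.MM.DD` are ignored, so unrelated
    indices are never selected for deletion.
    """
    cutoff = today - timedelta(days=days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            index_day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # prefix matched but the suffix is not a date
        if index_day < cutoff:
            expired.append(name)
    return expired


# Hypothetical index list, for illustration only.
names = ["logstash-2014.02.28", "logstash-2014.03.01", "logstash-2014.03.29", "kibana-int"]
print(indices_older_than(names, 29, today=date(2014, 3, 30)))
```

The selected names would then be passed to whatever actually deletes the indices; keeping selection separate from deletion makes the retention rule easy to check without touching the cluster.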
Add it as a cron job on the elasticsearch master node...
This issue is assigned to me and, after reviewing it, I'm closing it because this problem is solved by curator and is something we won't be pursuing within this repository since migrating to logsearch/logsearch-boshrelease.
We used to use the logsearch-purge-bot to prune indices regularly - this project is going to be deprecated though, hence we need to find other options down the road (right now manual purging via ElasticHQ is trivial, but this obviously neither scales nor can be monitored properly).
As mentioned in https://github.com/cityindex/logsearch-purge-bot/issues/8 already, the Elasticsearch curator might offer everything we need and more eventually: