ccin2p3 / samplerr

Round robin timeseries middleware based on riemann and elasticsearch
Eclipse Public License 1.0

purge happens too early #17

Open smortex opened 4 years ago

smortex commented 4 years ago

When purging indices, the oldest index seems to be removed too early. It is currently 2020-04-12 23:49:28 UTC, and samplerr is configured to keep 3 days of metrics at a daily resolution. I therefore expect the daily indices to be:

.samplerr-2020.04.10
.samplerr-2020.04.11
.samplerr-2020.04.12

Yet, I only see:

health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .samplerr-2020.04.11 rz7tfYB1S4aug-9aKbWYEA   1   0   11269010            0      1.2gb          1.2gb
green  open   .samplerr-2020.04.12 Z1L6rfUoTtGt4AWhPQpF5Q   1   0   11203988            0      1.2gb          1.2gb

According to the log file (Europe/Paris time zone), the .samplerr-2020.04.10 index was removed at 2020-04-12 21:42:27 UTC:

INFO [2020-04-12 23:42:27,337] Thread-7 - riemann.plugin.samplerr - delete index .samplerr-2020.04.10

I would expect .samplerr-2020.04.10 not to be removed before 2020-04-13 00:00:00 UTC in such conditions.
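
To make the expectation above concrete, here is a minimal sketch of the retention arithmetic. It assumes the window is counted in whole UTC calendar days and includes the current day. The function names `expected_daily_indices` and `earliest_purge` are hypothetical and are not part of samplerr, and this is not how samplerr actually computes its purge window.

```python
from datetime import datetime, timedelta, timezone


def expected_daily_indices(now, keep_days, prefix=".samplerr-"):
    """Names of the daily indices that should still exist at `now`.

    Assumption: the window covers `keep_days` whole UTC calendar days,
    the current day included.
    """
    today = now.astimezone(timezone.utc).date()
    days = [today - timedelta(days=offset) for offset in range(keep_days - 1, -1, -1)]
    return [f"{prefix}{day:%Y.%m.%d}" for day in days]


def earliest_purge(index_day, keep_days):
    """First UTC instant at which the index for `index_day` leaves the window."""
    midnight = datetime(index_day.year, index_day.month, index_day.day, tzinfo=timezone.utc)
    return midnight + timedelta(days=keep_days)


# Values taken from the report above.
now = datetime(2020, 4, 12, 23, 49, 28, tzinfo=timezone.utc)
print(expected_daily_indices(now, keep_days=3))
# ['.samplerr-2020.04.10', '.samplerr-2020.04.11', '.samplerr-2020.04.12']
print(earliest_purge(datetime(2020, 4, 10).date(), keep_days=3))
# 2020-04-13 00:00:00+00:00
```

Under these assumptions, the deletion logged at 2020-04-12 21:42:27 UTC happened a little over two hours before the index left the window.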

faxm0dem commented 4 years ago

Hi smortex, sorry for the inconvenience. I'll have a look at this ASAP. How's your clojure-fu doing? ;-)

smortex commented 4 years ago

No worries, I just stumbled on this and wanted to log it somewhere so that it's not forgotten.

My clojure-fu still can't find any time to start to exist… Yet, I am working on the idea we discussed about aliases with filters, to keep different resolutions from being interleaved when accessing the data. This is currently under development, but I deployed it in prod and am currently enjoying it: https://github.com/opus-codium/maintain-samplerr-aliases

I am secretly refactoring the code a lot in order to have a clean separation between the concepts because I hope to be able to integrate this in samplerr at some point in the future (when I get this clojure-fu :trollface:).
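
For readers unfamiliar with the alias-with-filters idea mentioned above, here is a rough sketch of what such a request body could look like. It is not taken from maintain-samplerr-aliases. The alias name `samplerr`, the monthly index name `.samplerr-2020.04`, the `@timestamp` field, and the helper `alias_actions` are all assumptions made for illustration. The only thing relied on is the general shape of the Elasticsearch `_aliases` API: a list of `add` actions, each with an optional `filter`.

```python
import json


def alias_actions(alias, windows, time_field="@timestamp"):
    """Build an `_aliases` request body with one filtered `add` action per index.

    `windows` is a list of (index, start, end) tuples. Each index only
    contributes documents whose `time_field` falls in [start, end), so
    indices of different resolutions never overlap through the alias.
    """
    actions = []
    for index, start, end in windows:
        actions.append({
            "add": {
                "index": index,
                "alias": alias,
                "filter": {"range": {time_field: {"gte": start, "lt": end}}},
            }
        })
    return {"actions": actions}


# Hypothetical layout: the daily index serves the most recent day,
# a coarser monthly index serves everything older within the month.
body = alias_actions("samplerr", [
    (".samplerr-2020.04.12", "2020-04-12", "2020-04-13"),
    (".samplerr-2020.04", "2020-04-01", "2020-04-12"),
])
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a POST to the cluster's `_aliases` endpoint. Queries against the alias then see each time range from exactly one resolution.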