Open celestialorb opened 3 years ago
👋 hey @celestialorb, thanks for raising this issue!
I looked into it and indeed there was some missing logic to handle this scenario. I believe I've managed to sort it out in eac8176.
I will give it a try later on.
I'm using `v0.5.2` of the exporter, which I've recently hooked up to Redis for an HA setup, but I've noticed that the exporter doesn't seem to re-pull data if it's killed or rolled during a task.

I can reproduce the issue by starting with a fresh Redis (a new cluster, or by issuing a `FLUSHALL`), starting the exporter, then killing or rolling it during its initial metrics pull. In this debugging configuration the exporter runs as a single-replica Kubernetes deployment. The exporter will then seemingly report forever that its task is already queued, and will thus skip it.
I assume the queued task will expire at some point and thus recover; however, I left it running for days and it never seemed to. For debugging purposes I have greatly increased the pull and garbage-collection frequency (down to pulling every ten seconds and GC'ing every 30 seconds) to ascertain what effect they have on the exporter.
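On the expiry point: if the "task queued" marker were written with `SET ... NX EX <ttl>` semantics, a marker orphaned by a killed pod would lapse on its own and the next instance could re-queue the pull. A minimal sketch of that behavior, using an in-memory stand-in for Redis and a hypothetical key name (I don't know the exporter's actual keys, or whether it sets a TTL at all):

```python
import time

# In-memory stand-in for the two Redis commands involved; the key name
# "exporter:task:queued" is hypothetical, not the exporter's real key.
class FakeRedis:
    def __init__(self):
        self.store = {}  # key -> (value, expires_at or None)

    def _expire(self, key):
        entry = self.store.get(key)
        if entry and entry[1] is not None and time.monotonic() >= entry[1]:
            del self.store[key]

    def set(self, key, value, nx=False, ex=None):
        self._expire(key)
        if nx and key in self.store:
            return False  # someone already queued the task
        self.store[key] = (value, (time.monotonic() + ex) if ex else None)
        return True

    def exists(self, key):
        self._expire(key)
        return key in self.store

r = FakeRedis()
# First exporter instance marks the task as queued with a short TTL.
assert r.set("exporter:task:queued", "1", nx=True, ex=0.1)
# A restarted exporter sees the (stale) marker and skips the pull...
assert not r.set("exporter:task:queued", "1", nx=True, ex=0.1)
time.sleep(0.2)
# ...but once the TTL lapses, the marker expires and the pull can be retried.
assert r.set("exporter:task:queued", "1", nx=True, ex=0.1)
```

Without a TTL (`ex=None`), the stale marker would survive indefinitely, which would match the stuck behavior I'm seeing.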
Performing another `FLUSHALL` in Redis causes it to start pulling again, since that clears the queued tasks.

Is there any configuration I can change to make it recover faster under these circumstances? Am I missing something? Does the exporter mark the tasks it owns as failed / up for reprocessing when it receives a termination signal?
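For the last question, what I'd expect (purely a sketch of the idea, not the exporter's actual code) is something like a SIGTERM handler that releases the "queued" markers the instance owns before exiting, so a replacement pod can re-queue the pull instead of skipping it forever. Kubernetes sends SIGTERM on a pod roll, so the hook point exists; the key name below is an assumption:

```python
import os
import signal

# Hypothetical: the set of "queued" marker keys this instance owns.
# In the real exporter these would live in Redis, not in memory.
queued_tasks = {"exporter:task:queued"}
released = []

def handle_sigterm(signum, frame):
    # In the real exporter this would be a DELETE (or re-enqueue) in Redis,
    # so the marker doesn't outlive the pod that created it.
    for key in sorted(queued_tasks):
        queued_tasks.discard(key)
        released.append(key)

signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate Kubernetes sending SIGTERM during a pod roll.
os.kill(os.getpid(), signal.SIGTERM)
```

If the exporter already does something like this, then perhaps the issue is only with ungraceful kills (SIGKILL / OOM), where no handler can run and only a TTL would help.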