Closed deejay1 closed 6 years ago
That's really odd behaviour; sending datapoints twice is very unexpected.
Is the relay reporting any connection issues? The only explanation I can think of right now is that writing the metrics to the remote graphite server fails for some reason.
No connection errors were logged during this time; the output came straight from a tcpdump on the relay. One thing we're investigating right now is bonding issues (or something similar) on the receiving hosts, because we route the metrics back to the load-balanced relay pool, from where they are forwarded to two graphite clusters: one with fnv1a_ch replication (where we're seeing the gaps) and one "normal" cluster, which seems to be fine.
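For anyone wanting to check a capture the same way, here is a minimal sketch that scans Graphite plaintext lines (`metric value timestamp`) for repeated timestamps and for holes larger than the reporting interval. The function name `find_anomalies`, the 60-second interval, and the metric name in the sample are all assumptions made for illustration; they are not taken from the relay or from the capture discussed here.

```python
from collections import defaultdict


def find_anomalies(lines, interval=60):
    """Return (duplicates, gaps) found in Graphite plaintext lines.

    interval is the expected spacing between datapoints in seconds;
    60 is an assumption and should match the relay's stats interval.
    """
    seen = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # not a "metric value timestamp" line
        metric, _value, ts = parts
        try:
            seen[metric].append(int(float(ts)))
        except ValueError:
            continue  # skip lines with a non-numeric timestamp

    duplicates = []  # (metric, timestamp) sent more than once
    gaps = []        # (metric, last_seen, next_seen) with missing points between
    for metric, stamps in seen.items():
        unique = sorted(set(stamps))
        for ts in unique:
            if stamps.count(ts) > 1:
                duplicates.append((metric, ts))
        for prev, cur in zip(unique, unique[1:]):
            if cur - prev > interval:
                gaps.append((metric, prev, cur))
    return duplicates, gaps


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real capture.
    sample = [
        "relay.stat 1 60",
        "relay.stat 2 120",
        "relay.stat 2 120",
        "relay.stat 4 240",
    ]
    dups, gaps = find_anomalies(sample, interval=60)
    print("duplicates:", dups)
    print("gaps:", gaps)
```

Running it against the payload lines extracted from the capture on both the sending and the receiving side should show whether the duplicate already leaves the relay or only appears further down the path.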
Closing this issue for now; please reopen if the problem persists.
It seems the relay misses or duplicates some datapoints when sending its own statistics, which results in holes in the data. A quick tcpdump resulted in:
Stat configuration is standard: