Open gkramer opened 2 years ago
If you look on the Cassandra side:
Inside your container, does carbon restart by itself?
Apologies for the delay in coming back to you!
I've rebuilt the cache container to only run carbon cache. Previously, it was running statsd+carbon+etc, all under supervisord or similar. The container now runs carbon exclusively.
At first, under low load, there were no metric drop-outs at all. We were shipping all metrics for spark, and it was bulletproof. As soon as we started shipping more metrics from other services, we began to see drop-outs of 1-2 minutes across multiple metrics. Another interesting observation is that metrics appear to disappear at times - I'm not sure if they are being overwritten by null values? What I can tell you is that metrics are being fed into what is now a dedicated carbon ingress, and being inspected from another graphite endpoint, so whisper data is not a thing.
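To put numbers on drop-outs like these, a small sketch along the following lines might help. It assumes the datapoints come from the Graphite render API with `format=json`, where each series carries a list of `[value, timestamp]` pairs and missing points show up as `null` (`None` in Python). The function name `find_gaps` is made up for illustration and is not part of Graphite or BigGraphite.

```python
from typing import List, Optional, Sequence, Tuple


def find_gaps(
    datapoints: Sequence[Sequence[Optional[float]]],
) -> List[Tuple[int, int, int]]:
    """Return (first_ts, last_ts, missing_points) for each run of null values.

    `datapoints` is assumed to be a list of [value, timestamp] pairs,
    ordered by timestamp, as found in render API JSON output.
    """
    gaps = []
    run_start = None  # timestamp of the first null in the current run
    run_last = None   # timestamp of the most recent null in the current run
    run_len = 0       # number of nulls in the current run

    for value, ts in datapoints:
        if value is None:
            if run_start is None:
                run_start = ts
            run_last = ts
            run_len += 1
        elif run_start is not None:
            # A real value closes the current run of nulls.
            gaps.append((run_start, run_last, run_len))
            run_start, run_last, run_len = None, None, 0

    # A run of nulls that reaches the end of the series is still a gap.
    if run_start is not None:
        gaps.append((run_start, run_last, run_len))
    return gaps


# Hypothetical sample at 60 s resolution with two gaps.
sample = [[1.0, 0], [None, 60], [None, 120], [2.0, 180], [None, 240]]
print(find_gaps(sample))
```

Running this across several metrics for the same window would show whether the gaps line up in time (pointing at the ingress) or are scattered per metric.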
I've made multiple tweaks to the configs, but I'm at a bit of a loss as to how to eradicate the intermittent data loss.
Any help would be GREATLY appreciated!
TIA!
Hey guys,
Wondering if someone could assist with an issue I'm having with BigGraphite [BG]. It currently receives a large number of metrics, but appears to drop a noticeable proportion randomly... this was highlighted when looking at metrics from Apache Spark, which have frequent gaps per hour (of one minute each).
Infrastructure Setup:
I can see traffic coming in to the interface (tcpdump/tcpflow), and can see logs to bg-carbon.log with references to 'cache query', but almost no datapoint logs for spark metrics.
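One way to narrow this down might be to send a numbered probe metric straight to carbon and then check which sequence numbers come back. The sketch below assumes the listener speaks the standard carbon plaintext protocol (`<path> <value> <timestamp>\n`, conventionally on port 2003) and that the metric is stored at one-minute resolution. The host, port, and the metric name `test.probe.seq` are placeholders, not values from this setup.

```python
import socket
import time


def format_plaintext(path: str, value: float, timestamp: float) -> str:
    """Build one line of the carbon plaintext protocol."""
    return f"{path} {value} {int(timestamp)}\n"


def send_probe(host: str, port: int, path: str,
               count: int = 10, interval: float = 60.0) -> None:
    """Send `count` datapoints whose value is their sequence number.

    `interval` is assumed to match the storage resolution (60 s here),
    so each datapoint should land in its own slot.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        for seq in range(count):
            line = format_plaintext(path, seq, time.time())
            sock.sendall(line.encode("ascii"))
            if seq < count - 1:
                time.sleep(interval)


# Example call (placeholder host and port, not run here):
# send_probe("carbon.example.internal", 2003, "test.probe.seq")
```

If a sequence number is missing when the probe is read back, the datapoint was lost after it reached the socket, which would rule out the sending services.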
Any assistance in troubleshooting would be greatly appreciated!