Hi @lewismc! Thanks for reaching out! To your questions:

The behavior I would expect to see is that the full JSON would be sent to Splunk, with the example you provided included as the `event` property. That `event` field becomes the raw text of the event, with the rest of the fields controlling event metadata. Then when searching the data, you would see correct event metadata (time/source/sourcetype/index), but only the `event` itself would be included in the raw text.
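For illustration, a minimal sketch of that shape (placeholder values; the `event` body here is arbitrary example data, not from your report):

```json
{
  "time": 1485314310.000,
  "source": "<from_config>",
  "sourcetype": "<from_config>",
  "host": "<from_config>",
  "event": {
    // everything in here becomes the event's raw text (_raw)
    "gauges": { "foo.cpu.idle": 0.0 }
  }
}
```

Only the `event` value becomes `_raw`; the sibling fields control the event's time/source/sourcetype/host.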
Thanks for the quick response @emiller42. I'm collecting all of the info to respond thoroughly.
Hi @emiller42 here's an update. It's important for me to state that we are now on Splunk v8, in which metrics data has changed a bit. With that in mind, it may be the case that the splunk-statsd-backend needs to be updated to reflect those changes.
In response to the actual question, that is the `_raw` payload after the data has been received and processed by a heavy forwarder. Metadata fields are being processed just fine; it's the index-time metrics field extraction that is the issue. We are seeing the following error:
> INDEXER has the following message: The metric event is not properly structured, source=statsd, sourcetype=_json, host=OMITTED, index=metrics_data. Metric event data without a metric name and properly formatted numerical values are invalid and cannot be indexed. Ensure the input metric data is not malformed, has one or more keys of the form "metric_name:" (e.g. "metric_name:cpu.idle") with corresponding floating point values.
Let me know if this helps.
That detail definitely helps! It looks like you're attempting to ingest the data as Metrics, which IIRC wasn't yet available when I wrote this backend. I'll see if I can get this playing nicely with Splunk Metrics over the weekend. (likely behind a config toggle so it isn't breaking for anyone else who might be using this backend)
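To give a sense of what I mean (a sketch only; the key names here are illustrative, not the backend's actual config schema, and `useMetrics` is a hypothetical toggle name):

```json
{
  "backends": ["splunk-statsd-backend"],
  "splunk": {
    "splunkHost": "splunk.example.com",
    "splunkToken": "<your-HEC-token>",
    // hypothetical: true = HEC metrics format, false = current event behavior
    "useMetrics": true
  }
}
```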
I also notice that Splunk appears to directly support StatsD. Is this something you've looked into already?
> It looks like you're attempting to ingest the data as Metrics

Yes.

> I also notice that Splunk appears to directly support StatsD. Is this something you've looked into already?

Actually, no, it isn't. I'm going to have a look at that tomorrow.
We only started work on this at the beginning of the week, so we're still learning. Also, we are the first team at work to attempt to ingest StatsD metrics into Splunk... so we're kind of breaking ground, so to speak.
Hi @emiller42, according to my Splunk admin, the reason we haven't gone down the route of the direct input for StatsD is that deploying an entire cluster-wide change to receive data like that is a heavy effort. We are attempting to get this input working using Cribl. For now, though, deploying the cluster-wide change for Splunk to receive direct StatsD output is not an option. Quick question: have you started working on anything yet? If not, I am happy to dive in and get my hands dirty :) Thank you
I've been catching up on the implementation of metrics via HEC.
Based on the above, I think I would send a single payload for all metrics of a given metric type, with a field indicating the `metric_type`. This means that if all four supported metric types are used, there would be four POST payloads per flush interval. (Example payloads at the bottom.)
Does this look like it would suit your needs?
Gauges:

```json
{
  "time": 1485314310.000,
  "event": "metric",
  "source": "<from_config>",
  "sourcetype": "<from_config>",
  "host": "<from_config>",
  "fields": {
    "metric_type": "gauge",
    // Metric names would be passthrough from input
    "metric_name:foo.cpu.user": 0.0,
    "metric_name:foo.cpu.system": 0.0,
    "metric_name:foo.cpu.idle": 0.0,
    "metric_name:bar.cpu.user": 0.0,
    "metric_name:bar.cpu.system": 0.0,
    "metric_name:bar.cpu.idle": 0.0
    // ... etc.
  }
}
```
Sets:

```json
{
  "time": 1485314310.000,
  "event": "metric",
  "source": "<from_config>",
  "sourcetype": "<from_config>",
  "host": "<from_config>",
  "fields": {
    "metric_type": "set",
    // Metric names would be passthrough from input
    "metric_name:foo.uniques": 98,
    "metric_name:bar.uniques": 127
    // ... etc.
  }
}
```
Counters: Each counter would result in two metrics, `<metric_name>.count` and `<metric_name>.rate` (rate being the count divided by the flush interval in seconds, so 17046 counts over a 10-second flush gives a rate of 1704.6).
```json
{
  "time": 1485314310.000,
  "event": "metric",
  "source": "<from_config>",
  "sourcetype": "<from_config>",
  "host": "<from_config>",
  "fields": {
    "metric_type": "counter",
    "metric_name:foo.requests.count": 17046,
    "metric_name:foo.requests.rate": 1704.6,
    "metric_name:bar.requests.count": 12544,
    "metric_name:bar.requests.rate": 1254.4
    // ... etc.
  }
}
```
Timers: These would output a full set of metrics derived from the input metric name. Example using metric name `foo.duration`:
```json
{
  "time": 1485314310.000,
  "event": "metric",
  "source": "<from_config>",
  "sourcetype": "<from_config>",
  "host": "<from_config>",
  "fields": {
    "metric_type": "timer",
    "metric_name:foo.duration.count_90": 304,
    "metric_name:foo.duration.mean_90": 143.07236842105263,
    "metric_name:foo.duration.upper_90": 280,
    "metric_name:foo.duration.sum_90": 43494,
    "metric_name:foo.duration.sum_squares_90": 8083406,
    "metric_name:foo.duration.std": 86.5952973729948,
    "metric_name:foo.duration.upper": 300,
    "metric_name:foo.duration.lower": 1,
    "metric_name:foo.duration.count": 338,
    "metric_name:foo.duration.count_ps": 33.8,
    "metric_name:foo.duration.sum": 53402,
    "metric_name:foo.duration.sum_squares": 10971776,
    "metric_name:foo.duration.mean": 157.9940828402367,
    "metric_name:foo.duration.median": 157.5,
    // histogram fields
    "metric_name:foo.duration.histogram.bin_50": 49,
    "metric_name:foo.duration.histogram.bin_100": 45,
    "metric_name:foo.duration.histogram.bin_150": 66,
    "metric_name:foo.duration.histogram.bin_200": 60,
    "metric_name:foo.duration.histogram.bin_inf": 118
    // ... etc.
  }
}
```
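For context on the `histogram.bin_*` fields above: those would come from statsd's standard histogram settings, so bins like these would be produced by config roughly along the following lines (a sketch):

```json
{
  "histogram": [
    { "metric": "foo.duration", "bins": [50, 100, 150, 200, "inf"] }
  ]
}
```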
@emiller42 the above looks great. I assume this is now built into the master branch? If you can confirm, then we can test out the master branch with our StatsD server.
Not yet, I've got some local changes to implement it. Doing some cleanup and I'll have you take a look at the PR when ready.
This has been pushed and is available in v0.2.0.
👍 Thank you @emiller42
Hi @emiller42, we recently picked up this backend and have been experimenting with sending events from Apache Airflow --> StatsD --> splunk-statsd-backend --> Splunk. It looks like the JSON events we are sending look like
... whereas the splunk-statsd-backend documentation states that the JSON should look like
Questions:
If this backend needs to be augmented to accommodate new versions of Splunk, I am happy to do that with a PR. I just want to know whether this project is still alive, though! Thank you