Open breathe opened 2 years ago
It appears that the log_to_metric
transform has no way to derive an arbitrary set of tags from a log event ...? The example above fails because the tags
configuration parameter needs to be a static map ...
I've ended up working around this issue for now with a lua transform ... -- at the moment I only care about supporting gauge
metrics, so I've only implemented the transform for the gauge type (I suspect more sophisticated logic is needed for some of the other metric types ...)
It would be really nice if I could instead just point the log_to_metric
transform at a field containing all the tags I want defined on the output metric ... To prevent high-cardinality metric issues -- I'd use the tag_cardinality_limit
transform ...
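For illustration, here's roughly the kind of configuration I have in mind -- note that the templated `tags` value below is hypothetical and not valid Vector syntax today, since `tags` must be a static map:

```yaml
remap_gauge_log_to_metric:
  type: log_to_metric
  inputs:
    - route_logs_by_type.gauge
  metrics:
    - type: "gauge"
      field: "FIELD"
      name: "{{NAME}}"
      namespace: "{{NAMESPACE}}"
      # Hypothetical: take the entire tag set from the TAGS field of the log event
      tags: "{{TAGS}}"
```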
transforms:
  parsing:
    type: "remap"
    inputs:
      - snowflake_s3
    source: |-
      . = parse_json!(string!(.message))
  route_logs_by_type:
    type: route
    inputs:
      - parsing
    route:
      counter: .TYPE == "counter"
      histogram: .TYPE == "histogram"
      gauge: .TYPE == "gauge"
      set: .TYPE == "set"
      summary: .TYPE == "summary"
      log: .TYPE == "log"
  remap_counter_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.counter
    metrics:
      - type: "counter"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"
  remap_histogram_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.histogram
    metrics:
      - type: "histogram"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"
  # {"name":"storage.table.retained_bytes.avg","namespace":"snowflake","tags":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"timestamp":"2022-10-05T21:09:16Z","kind":"absolute","gauge":{"value":0.0}}
  remap_gauge_log_to_metric:
    type: lua
    version: "2"
    inputs:
      - route_logs_by_type.gauge
    hooks:
      process: |-
        function (event, emit)
          -- build an absolute gauge metric from the parsed log fields
          event.metric = {
            name = event.log.NAME,
            namespace = event.log.NAMESPACE,
            kind = "absolute",
            timestamp = os.date("!*t"),
            tags = event.log.TAGS,
            gauge = {
              value = event.log.FIELD
            }
          }
          -- drop the log portion so only the metric event is emitted
          event.log = nil
          emit(event)
        end
  # {"name":"storage.table.retained_bytes.avg","namespace":"snowflake","timestamp":"2022-10-05T21:04:04.776646Z","kind":"absolute","gauge":{"value":0.0}}
  # remap_gauge_log_to_metric:
  #   type: log_to_metric
  #   inputs:
  #     - route_logs_by_type.gauge
  #   metrics:
  #     - type: "gauge"
  #       field: "FIELD"
  #       name: "{{NAME}}"
  #       namespace: "{{NAMESPACE}}"
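If the other metric types turn out to need the same treatment, the Lua workaround could presumably be extended per type. A hypothetical, untested sketch for counters (assuming the logs carry counter deltas in FIELD, hence the incremental kind):

```yaml
remap_counter_log_to_metric_lua:
  type: lua
  version: "2"
  inputs:
    - route_logs_by_type.counter
  hooks:
    process: |-
      function (event, emit)
        event.metric = {
          name = event.log.NAME,
          namespace = event.log.NAMESPACE,
          -- counters are deltas, so mark them incremental rather than absolute
          kind = "incremental",
          timestamp = os.date("!*t"),
          tags = event.log.TAGS,
          counter = {
            value = event.log.FIELD
          }
        }
        event.log = nil
        emit(event)
      end
```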
  remap_set_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.set
    metrics:
      - type: "gauge"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"
  remap_summary_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.summary
    metrics:
      - type: "summary"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"
  remap_log_field_to_message:
    type: remap
    inputs:
      - route_logs_by_type.log
    source: |-
      .message = .FIELD
      del(.FIELD)
      .ddsource = .NAMESPACE
      del(.NAMESPACE)
      .service = .TAGS.service
      .ddtags = .TAGS
      del(.TAGS)
      del(.ddtags.service)
      del(.TYPE)
      .timestamp = now()
sinks:
  datadog_metrics:
    type: datadog_metrics
    inputs:
      - remap_counter_log_to_metric
      - remap_histogram_log_to_metric
      - remap_gauge_log_to_metric
      - remap_set_log_to_metric
      - remap_summary_log_to_metric
    default_api_key: "${DD_API_KEY:?err}"
  datadog_logs:
    type: datadog_logs
    inputs:
      - remap_log_field_to_message
      - route_logs_by_type._unmatched
    default_api_key: "${DD_API_KEY:?err}"
  console_output:
    type: console
    inputs:
      - remap*
    encoding:
      codec: "text"
tests:
  - name: "parsing -> parsing"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}
    outputs:
      - extract_from: parsing
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.TYPE))
  - name: "parsing -> route_logs_by_type.gauge"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}
    outputs:
      - extract_from: route_logs_by_type.gauge
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.NAMESPACE))
  - name: "parsing -> remap_gauge_log_to_metric"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}
    outputs:
      # {"name":"storage.table.retained_bytes.avg","namespace":"snowflake","tags":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"timestamp":"2022-10-05T21:09:16Z","kind":"absolute","gauge":{"value":0.0}}
      - extract_from: remap_gauge_log_to_metric
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.name))
              assert!(exists(.namespace))
              assert!(exists(.tags))
              assert!(exists(.timestamp))
              assert!(exists(.kind))
  - name: "parsing -> remap_log_field_to_message"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":"test logging","NAMESPACE":"snowflake","TAGS":{"env":"dev","service":"snowflake"},"TYPE":"log"}
    outputs:
      # {"ddsource":"snowflake","ddtags":{"env":"dev"},"message":"test logging","service":"snowflake","timestamp":"2022-10-05T21:48:30.823011Z"}
      - extract_from: remap_log_field_to_message
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.ddsource))
              assert!(exists(.ddtags))
              assert!(exists(.message))
              assert!(exists(.service))
              assert!(exists(.timestamp))
Hi @breathe !
Thanks for reporting this and apologies for the delay in review. You are correct that that page is misleading. We'll need to update that but you can also see https://github.com/vectordotdev/vector/blob/master/lib/codecs/tests/data/native_encoding/schema.cue for a more accurate description of the metric schema.
Problem
This page is unclear and inconsistent --
https://vector.dev/docs/about/under-the-hood/architecture/data-model/metric/
The image shows an 'example metric event', but histogram is documented as being represented by the required fields "buckets", "count", and "sum".
"kind" is not documented anywhere, yet it's marked as required and isn't shown in the example.
I assume there are probably other things out of date or wrong in this documentation ...
Example data to illustrate the metric data model and the various types supported would be nice ...
I'm building a new datasource that will output ndjson to S3, and I want to use Vector with the aws_s3/sqs source to ship logs and metrics to various destination sinks. I control the output format and want to emit a generic format that Vector can slurp up and ship as logs and metrics to destination sinks -- but it's surprisingly tricky given the current documentation ...
Given that my input will be logs, I suppose I will need to use the log_to_metric transform ... and given that I can customize my output source -- is it possible to use a totally generic transformation? Something like this maybe?
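For concreteness, the generic record format I'm emitting is one JSON object per line, where TYPE selects the metric type and TAGS carries the full tag set, e.g.:

```json
{"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}
```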
Configuration
Version
24.1
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response