vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Odd influxdb metric encoding #7455

Open jszwedko opened 3 years ago

jszwedko commented 3 years ago

Reported by a user in Discord: https://discord.com/channels/742820443487993987/746070591097798688/842534119246397456

The user is seeing Vector write metrics like:

host.memory_cached_bytes,collector=memory,host=b837d333c5be,metric_type=gauge value=614524000 1620945972244718300
host.memory_active_bytes,collector=memory,host=b837d333c5be,metric_type=gauge value=564312000 1620945972244718300
host.memory_used_bytes,collector=memory,host=b837d333c5be,metric_type=gauge value=3246316000 1620945972244718300

But they cannot figure out how to do calculations across these metrics, because the field name, value, is the same for all of them.

Telegraf writes metrics that look like:

diskio,host=seth.localdomain.com,name=sda2 reads=798000i,writes=2674662i,read_bytes=28144332800i,write_bytes=143501811712i,read_time=1121241i,io_time=1969500i,iops_in_progress=0i,write_time=4380131i,weighted_io_time=4382548i,merged_reads=120516i,merged_writes=130627i 1620945840000000000

This seems more correct to me at first blush, but I'm also not terribly familiar with influxdb.
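
To make the difference concrete, here is a minimal Python sketch that parses the Vector lines above and regroups them so the three memory metrics sit side by side, the way Telegraf's fields do. The names parse_line and merge_by_series are hypothetical helpers for illustration only, and the parser assumes no escaped spaces or commas in the lines.

```python
def parse_line(line):
    """Parse one line-protocol line (assumes no escaped spaces or commas)."""
    head, fields_part, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in fields_part.split(","))
    return measurement, tags, fields, int(timestamp)


def merge_by_series(lines):
    """Group single-field points that share tags and timestamp.

    Each group becomes a dict keyed by measurement name, so values from
    different metrics can be combined in one expression.
    """
    merged = {}
    for line in lines:
        measurement, tags, fields, ts = parse_line(line)
        key = (tuple(sorted(tags.items())), ts)
        merged.setdefault(key, {})[measurement] = float(fields["value"])
    return merged


# The three lines reported in this issue.
vector_lines = [
    "host.memory_cached_bytes,collector=memory,host=b837d333c5be,metric_type=gauge value=614524000 1620945972244718300",
    "host.memory_active_bytes,collector=memory,host=b837d333c5be,metric_type=gauge value=564312000 1620945972244718300",
    "host.memory_used_bytes,collector=memory,host=b837d333c5be,metric_type=gauge value=3246316000 1620945972244718300",
]

# Every Vector line carries exactly one field, and it is always "value".
field_names = {name for line in vector_lines for name in parse_line(line)[2]}

# After regrouping, a cross-metric calculation becomes straightforward.
merged = merge_by_series(vector_lines)
row = next(iter(merged.values()))
used_minus_cached = row["host.memory_used_bytes"] - row["host.memory_cached_bytes"]
```

With Telegraf's layout this regrouping step is unnecessary, since related values already arrive as separate fields of one point.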

juvenn commented 2 years ago

I suspect the user should use the influxdb_logs sink instead of the influxdb_metrics sink.

The latter assumes a fixed metric schema: metrics are aggregated and then encoded, again with a fixed schema, into InfluxDB line protocol. It's not odd, it's just meant for a different use case.

To send an event as is, use the influxdb_logs sink instead.

E.g. given an event,
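
As a rough illustration of that fixed schema, the Python sketch below reproduces the shape of the lines in the original report: the measurement is the namespace joined to the metric name, a metric_type tag is added, and the only field is value. encode_metric is a hypothetical helper written for this example, not Vector's actual encoder, and it skips escaping entirely.

```python
def encode_metric(namespace, name, kind, value, tags, timestamp_ns):
    """Encode one metric as a single-field line-protocol point.

    Sketch only: mirrors the output shown in this issue, not Vector's code.
    """
    measurement = f"{namespace}.{name}"
    # The metric kind travels as an extra tag next to the user's tags.
    all_tags = dict(tags, metric_type=kind)
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(all_tags.items()))
    # Whatever the metric is, the field key is always "value".
    return f"{measurement},{tag_str} value={value} {timestamp_ns}"


line = encode_metric(
    "host",
    "memory_cached_bytes",
    "gauge",
    614524000,
    {"collector": "memory", "host": "b837d333c5be"},
    1620945972244718300,
)
```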

{host: "seth.localdomain.com", name: "sda2", reads: 798000i, writes: 2674662i}

it will be written to InfluxDB as is,

diskio,host=seth.localdomain.com,name=sda2 reads=798000i,writes=2674662i 1620945840000000000

See https://vector.dev/docs/reference/configuration/sinks/influxdb_logs/#mapping-log-fields
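
The mapping described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the sink's implementation: encode_log_event and format_field are hypothetical names, the measurement name and the list of tag keys are assumed to come from sink configuration, escaping of tag values is omitted, and the real sink may add tags or fields of its own.

```python
def format_field(value):
    """Render a field value using line-protocol type syntax."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace('"', '\\"') + '"'  # strings are quoted


def encode_log_event(measurement, event, tag_keys, timestamp_ns):
    """Turn one event into one point, keeping every non-tag key as a field."""
    tags = {k: event[k] for k in tag_keys}
    fields = {k: v for k, v in event.items() if k not in tag_keys}
    tag_str = ",".join(f"{k}={tags[k]}" for k in sorted(tags))
    field_str = ",".join(f"{k}={format_field(v)}" for k, v in fields.items())
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"


event = {"host": "seth.localdomain.com", "name": "sda2", "reads": 798000, "writes": 2674662}
line = encode_log_event("diskio", event, ["host", "name"], 1620945840000000000)
```

Because each event key keeps its own field name, reads and writes land in the same point and can be combined directly in a query.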