vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Allow socket sink to accept metric events #20756

Closed petmit closed 1 month ago

petmit commented 2 months ago

Use Cases

I am sending log and metric events through a data diode (a network security device which allows network traffic to flow in only one direction) by using socket sink and source in UDP mode.

Attempted Solutions

To get metric events through the socket sink (which currently accepts only log events), I use Lua transforms to wrap the metrics inside log events on the sender side and unwrap them on the receiver side. This works fine, but it would be unnecessary if the socket sink accepted metric events.

Lua script for wrapping on the sender side:

function (event, emit)
  emit {
    log = {
      source_type = "metric",
      metric = event.metric
    }
  }
end

Lua script for unwrapping on the receiver side:

function (event, emit)
  emit {
    metric = event.log.metric
  }
end

The source_type attribute is used for routing only.

Proposal

Allow the socket sink to accept metric events.

Version

vector 0.39.0 (x86_64-unknown-linux-gnu 73da9bb 2024-06-17 16:00:23.791735272)

jszwedko commented 2 months ago

I believe the socket source already accepts metrics if it is configured with a codec that can decode them. Currently, the only codec that can decode input as metrics is native_json. You should be able to configure the native_json codec on both your source and your sink to send data between them. I'll close this out since I'm pretty sure that will work, but let me know if it doesn't.

petmit commented 2 months ago

I can confirm that sending metric events to a socket source with the native_json codec works as expected. Tested with the following command:

user@vector-host:~$ echo '{"metric":{"name":"promhttp_metric_handler_requests_total","tags":{"code":"503"},"timestamp":"2024-07-01T18:00:59.772382913Z","kind":"absolute","counter":{"value":0.0}}}' | nc -u 127.0.0.1 9000

Configuration and output from Vector:

user@vector-host:~$ cat vector_socket_source.yaml
sources:
  remote:
    type: socket
    address: "127.0.0.1:9000"
    mode: udp
    decoding:
      codec: native_json

sinks:
  print:
    type: console
    inputs:
      - remote
    encoding:
      codec: json
user@vector-host:~$ vector -c  vector_socket_source.yaml
2024-07-01T18:27:53.905457Z  INFO vector::app: Log level is enabled. level="info"
2024-07-01T18:27:53.906949Z  INFO vector::app: Loading configs. paths=["vector_socket_source.yaml"]
2024-07-01T18:27:53.908606Z  INFO vector::topology::running: Running healthchecks.
2024-07-01T18:27:53.908674Z  INFO vector::topology::builder: Healthcheck passed.
2024-07-01T18:27:53.908692Z  INFO vector: Vector has started. debug="false" version="0.39.0" arch="x86_64" revision="73da9bb 2024-06-17 16:00:23.791735272"
2024-07-01T18:27:53.908717Z  INFO vector::app: API is disabled, enable by setting `api.enabled` to `true` and use commands like `vector top`.
2024-07-01T18:27:53.908820Z  INFO source{component_kind="source" component_id=remote component_type=socket}: vector::sources::socket::udp: Listening. address=127.0.0.1:9000
{"name":"promhttp_metric_handler_requests_total","tags":{"code":"503"},"timestamp":"2024-07-01T18:00:59.772382913Z","kind":"absolute","counter":{"value":0.0}}

However, starting Vector with the following socket sink configuration fails:

user@vector-host:~$ cat vector_socket_sink.yaml 
sources:
  prometheus:
    type: prometheus_scrape
    endpoints:
      - http://localhost:9100/metrics

sinks:
  remote:
    type: socket
    inputs:
      - prometheus
    address: "127.0.0.1:9000"
    mode: udp
    encoding:
      codec: native_json
user@vector-host:~$ vector -c vector_socket_sink.yaml
2024-07-01T18:24:23.539734Z  INFO vector::app: Log level is enabled. level="info"
2024-07-01T18:24:23.542098Z  INFO vector::app: Loading configs. paths=["vector_socket_sink.yaml"]
2024-07-01T18:24:23.542943Z ERROR vector::cli: Configuration error. error=Data type mismatch between prometheus (Metric) and remote (Log)

Adding a transform lets Vector start, but no metrics are sent:

user@vector-host:~$ cat vector_socket_sink_with_transform.yaml
sources:
  prometheus:
    type: prometheus_scrape
    endpoints:
      - http://localhost:9100/metrics

transforms:
  remap_test:
    type: remap
    inputs:
      - prometheus
    source: .

sinks:
  remote:
    type: socket
    inputs:
      - remap_test
    address: "127.0.0.1:9000"
    mode: udp
    encoding:
      codec: native_json
user@vector-host:~$ vector -c vector_socket_sink_with_transform.yaml
2024-07-01T18:34:22.069986Z  INFO vector::app: Log level is enabled. level="info"
2024-07-01T18:34:22.071223Z  INFO vector::app: Loading configs. paths=["vector_socket_sink_with_transform.yaml"]
2024-07-01T18:34:22.073139Z  INFO vector::topology::running: Running healthchecks.
2024-07-01T18:34:22.073209Z  INFO vector: Vector has started. debug="false" version="0.39.0" arch="x86_64" revision="73da9bb 2024-06-17 16:00:23.791735272"
2024-07-01T18:34:22.073222Z  INFO vector::app: API is disabled, enable by setting `api.enabled` to `true` and use commands like `vector top`.
2024-07-01T18:34:22.073675Z  INFO vector::topology::builder: Healthcheck passed.

jszwedko commented 2 months ago

Thanks @petmit. It looks like the socket sink is incorrectly limiting the event types it accepts as input. Let me reopen this.

jszwedko commented 2 months ago

If anyone is interested in taking a shot at fixing this, I think we just need to remove the & DataType::Log from https://github.com/vectordotdev/vector/blob/e982f6679e9d1526d856efa462e728774b52cf34/src/sinks/socket.rs#L152 and add a test that exercises sending metrics.
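
To illustrate why that mask matters, here is a minimal, self-contained sketch. It does not use Vector's real types: `DataType` below is a hypothetical stand-in for a bit set of event types, and `codec_input_type` assumes that a codec such as native_json reports support for logs, metrics, and traces. Intersecting that set with the log bit leaves only logs, which would be consistent with the "Data type mismatch between prometheus (Metric) and remote (Log)" error reported above.

```rust
use std::ops::BitAnd;

/// Hypothetical stand-in for a bit set of event types (not Vector's real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DataType(u8);

impl DataType {
    const LOG: DataType = DataType(0b001);
    const METRIC: DataType = DataType(0b010);
    const TRACE: DataType = DataType(0b100);

    /// Returns true if every bit set in `other` is also set in `self`.
    fn contains(self, other: DataType) -> bool {
        self.0 & other.0 == other.0
    }

    /// Returns the set containing the types of both operands.
    fn union(self, other: DataType) -> DataType {
        DataType(self.0 | other.0)
    }
}

impl BitAnd for DataType {
    type Output = DataType;

    /// Intersection: keeps only the types present in both operands.
    fn bitand(self, rhs: DataType) -> DataType {
        DataType(self.0 & rhs.0)
    }
}

/// Assumption: the event types the configured codec says it can encode.
fn codec_input_type() -> DataType {
    DataType::LOG.union(DataType::METRIC).union(DataType::TRACE)
}

/// Sink input with the mask: the codec's types intersected with logs only.
fn sink_input_masked() -> DataType {
    codec_input_type() & DataType::LOG
}

/// Sink input without the mask: whatever the codec supports.
fn sink_input_unmasked() -> DataType {
    codec_input_type()
}

fn main() {
    let metric = DataType::METRIC;
    println!("masked sink accepts metrics:   {}", sink_input_masked().contains(metric));
    println!("unmasked sink accepts metrics: {}", sink_input_unmasked().contains(metric));
}
```

In this sketch the masked sink advertises logs only, regardless of what the codec can encode, while the unmasked sink defers entirely to the codec. Whether the real change needs anything beyond dropping the mask is for the linked source and its tests to confirm.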

nichtverstehen commented 1 month ago

I made the suggested changes in PR https://github.com/vectordotdev/vector/pull/20930. @jszwedko PTAL