vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Expose custom variables via internal metrics #20755

Open johnhtodd opened 2 weeks ago

johnhtodd commented 2 weeks ago

Use Cases

I set several environment variables in each Vector instance, which then regulate various actions within VRL. For instance, I have a sample rate that depends on the number of CPUs available on the system. To see this value when graphing Vector aggregation results in Prometheus, I currently have to import it via a third-party path (node monitors with text file scrape). This is a very roundabout way to get the data, which should be in Vector's own Prometheus output. That way I would have certainty that the data is correct and timely, and I could eliminate significant external tooling. These are static values, mostly based on environment variables that are customized to each exact instance of Vector and differ for every Vector instance we operate.

My use case is only Prometheus-based, but I can imagine others have more extensive uses for this type of data exposure.

Attempted Solutions

I have used a node text-file scraper for this, but that requires additional tooling and scripts to duplicate my data.

Proposal

I would suggest that an option exist in the internal metrics to include arbitrary metrics defined once as part of the internal_metrics definition:

Partial contents of /etc/default/vector (environment variables):

# For this location, what is the sample rate used? 
VECTORPOPSAMPLERATE=20

Partial contents of vector.yml:

sources:
  my_source_id:
    type: internal_metrics
    custom:
      samplerate: "${VECTORPOPSAMPLERATE}"
      blah: 4.4

sinks:
  vector-prometheus:
    type: "prometheus_exporter"
    inputs: ["my_source_id"]
    address: "[::]:9300"

Partial output of "curl http://127.0.0.1:9300":

# HELP vector_custom Vector custom values
# TYPE vector_custom gauge
vector_custom{host="dev01.ams",custom_tag="samplerate"} 20 1719601893267
vector_custom{host="dev01.ams",custom_tag="blah"} 4.4 1719601893267
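
For illustration, the mapping from the proposed `custom` block to the output above could be sketched in Python as follows. This is not Vector code: `render_custom_gauges` and `escape_label` are hypothetical names, the `vector` namespace prefix and the `host` tag are simply taken from the example output, and the trailing timestamp is left out to keep the sketch short.

```python
def escape_label(value):
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_custom_gauges(custom, host, namespace="vector"):
    """Render a mapping of custom values as one gauge series per key.

    `custom` mirrors the proposed `custom:` block; values may be numbers or
    strings (interpolated environment variables arrive as strings).
    """
    name = f"{namespace}_custom"
    lines = [
        f"# HELP {name} Vector custom values",
        f"# TYPE {name} gauge",
    ]
    for tag, raw in custom.items():
        value = float(raw)  # raises ValueError if the value is not numeric
        # Print whole numbers without a trailing ".0", as in the example output.
        text = str(int(value)) if value.is_integer() else repr(value)
        lines.append(
            f'{name}{{host="{escape_label(host)}",'
            f'custom_tag="{escape_label(tag)}"}} {text}'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_custom_gauges({"samplerate": "20", "blah": 4.4}, "dev01.ams"))
```

Each key becomes the `custom_tag` label of a single `vector_custom` gauge, so a non-numeric value fails loudly instead of producing an invalid series.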

References

No response

Version

vector 0.39.0 (x86_64-unknown-linux-gnu)

jszwedko commented 2 weeks ago

Neat idea, thanks @johnhtodd. Rather than modifying internal_metrics, we could also consider introducing a new source like static_metrics. Maybe the config would look something like:

sources:
  my_source_id:
    type: static_metrics
    interval_secs: 2
    metrics:
       - name: custom
         value: "${VECTORPOPSAMPLERATE}"
         tags:
           custom_tag: samplerate
       - name: custom
         value: 4.4
         tags:
           custom_tag: blah

I think this would result in better separation of concerns. I'm curious what you think of this UX though.

johnhtodd commented 2 weeks ago

I have no objection to that method - I'm ambivalent about the structure of the configuration as long as I can figure out how I need to write the syntax.

What does the "interval_secs" do in the example you provide?

Upon consideration of my initial suggestion, I have an update. It may be useful for tag names to be dynamic (able to be specified with variables themselves) and also for a metric to be able to carry multiple tags. Perhaps the first part of that request is more or less handled automatically by Vector's ability to expand variables in configurations. The second part, multiple tags, seems to be just a matter of writing the method to iterate over tags rather than handle a single one.

jszwedko commented 2 weeks ago

What does the "interval_secs" do in the example you provide?

This would be the interval at which the metric is emitted.

I realized my example wasn't quite right. I edited it in-place, but I think it'd be more like:

sources:
  my_source_id:
    type: static_metrics
    interval_secs: 2
    metrics:
       - name: custom
         value: "${VECTORPOPSAMPLERATE}"
         tags:
           custom_tag: samplerate
       - name: custom
         value: 4.4
         tags:
           custom_tag: blah

(the difference is I forgot value and the tags were incorrect)

I think this would support having multiple tags. Yes, I believe the first part could be handled via environment variable interpolation like:

tags:
  "${FOO}": "${BAR}"
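
As a rough illustration of that idea, the sketch below expands `${VAR}` references in both tag names and tag values, and works the same way for any number of tags. It is a Python approximation for discussion only, not how Vector implements interpolation; `interpolate` and `interpolate_tags` are hypothetical names, and only the plain `${VAR}` form is handled.

```python
import re

# Matches ${NAME}, where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(text, env):
    """Replace every ${NAME} in `text` with env[NAME]; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"undefined variable: {name}")
        return env[name]
    return _VAR.sub(substitute, text)


def interpolate_tags(tags, env):
    """Expand variables in both the keys and the values of a tags mapping."""
    return {
        interpolate(key, env): interpolate(str(value), env)
        for key, value in tags.items()
    }


if __name__ == "__main__":
    env = {"FOO": "custom_tag", "BAR": "samplerate"}
    print(interpolate_tags({"${FOO}": "${BAR}", "region": "ams"}, env))
```

Passing the environment in as a plain mapping (rather than reading `os.environ` directly) keeps the sketch easy to test. An undefined variable raises an error here, which is one possible policy among several.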