vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

fix(axiom sink): Rebase sink on `http` sink and remove `elasticsearch` compatibility #21362

Closed darach closed 1 month ago

darach commented 1 month ago

Fixes #21292

The axiom sink has been rebased from the `elasticsearch` sink onto the `http` sink, per the issue. Deprecated headers have been removed in favor of aligning with the `http` request configuration scope, and the compression configuration now defaults to gzip.

The sink now conforms to Axiom's native ingest API:

https://axiom.co/docs/send-data/ingest#ingest-api

A `_time` field in Vector event data is no longer rejected. This was a source of confusion for Vector users of the axiom sink in the past: because the sink was based on the `elasticsearch` sink, Elasticsearch field semantics and constraints were imposed on it.

For existing Vector users, the `_time` field is documented and aligns well with Vector's own timestamp mechanics:

https://axiom.co/docs/reference/field-restrictions#requirements-of-the-timestamp-field

If a `@timestamp` field is sent to the axiom sink, it is now treated as a normal field. This is a non-breaking change.

If a `_time` field is sent to Axiom, it no longer raises a 4XX error, so this is effectively a fix.

As the axiom sink is now HTTP based, the full range of Vector's `http` configuration options is now available to users of the axiom sink.
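To make the field handling above concrete, here is a minimal Python sketch of a gzip-compressed request body in which `_time` and `@timestamp` travel as ordinary JSON fields. It is an illustration only and is not taken from the sink's implementation: the helper name `build_ingest_body`, the sample event, and the choice of a JSON array body are assumptions.

```python
import gzip
import json


def build_ingest_body(events):
    """Serialize events as a JSON array and gzip the result.

    Hypothetical helper for illustration; the real sink builds its
    requests inside Vector's `http` sink machinery.
    """
    payload = json.dumps(events).encode("utf-8")
    return gzip.compress(payload)


# Assumed sample event: `_time` is passed through as-is, and
# `@timestamp` is just another field with no special meaning.
events = [
    {
        "_time": "2024-09-26T12:00:00Z",
        "@timestamp": "2024-09-26T12:00:00Z",
        "message": "hello",
    }
]

body = build_ingest_body(events)

# Headers a client would typically pair with a gzip-compressed JSON body.
headers = {"Content-Type": "application/json", "Content-Encoding": "gzip"}
```

Decompressing `body` and parsing it yields the original events unchanged, which is the point: neither field is rewritten or rejected on the client side.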

bits-bot commented 1 month ago

CLA assistant check
All committers have signed the CLA.

github-actions[bot] commented 1 month ago

Regression Detector Results

Run ID: 73b204e9-8458-4628-82ec-a6021eb66fa6 Metrics dashboard

Baseline: 73a03a7fb582d7706e5568d7ec3d50373a46e7b4 Comparison: e36654db7854bbf904d537c841d0cd363128a0e7

Performance changes are noted in the perf column of each table:

No significant changes in experiment optimization goals

Confidence level: 90.00% Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing `erratic: true` are ignored.

| perf | experiment | goal | Δ mean % | Δ mean % CI | links |
|------|-------------------|-------------------|----------|-----------------|-------|
| ➖ | file_to_blackhole | egress throughput | -5.05 | [-11.51, +1.41] | |

Fine details of change detection per experiment

| perf | experiment | goal | Δ mean % | Δ mean % CI | links |
|------|---------------------------------------------------|--------------------|----------|-----------------|-------|
| ➖ | syslog_splunk_hec_logs | ingress throughput | +1.71 | [+1.57, +1.85] | |
| ➖ | http_text_to_http_json | ingress throughput | +1.60 | [+1.46, +1.75] | |
| ➖ | syslog_loki | ingress throughput | +1.54 | [+1.45, +1.64] | |
| ➖ | syslog_log2metric_humio_metrics | ingress throughput | +0.54 | [+0.41, +0.67] | |
| ➖ | http_to_http_acks | ingress throughput | +0.46 | [-0.77, +1.70] | |
| ➖ | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | +0.30 | [+0.22, +0.38] | |
| ➖ | http_to_http_noack | ingress throughput | +0.17 | [+0.08, +0.25] | |
| ➖ | syslog_humio_logs | ingress throughput | +0.09 | [-0.05, +0.24] | |
| ➖ | http_to_http_json | ingress throughput | +0.06 | [-0.01, +0.13] | |
| ➖ | splunk_hec_indexer_ack_blackhole | ingress throughput | +0.02 | [-0.06, +0.10] | |
| ➖ | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | +0.00 | [-0.09, +0.10] | |
| ➖ | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -0.01 | [-0.14, +0.11] | |
| ➖ | syslog_log2metric_splunk_hec_metrics | ingress throughput | -0.34 | [-0.45, -0.23] | |
| ➖ | otlp_grpc_to_blackhole | ingress throughput | -0.37 | [-0.49, -0.26] | |
| ➖ | datadog_agent_remap_datadog_logs | ingress throughput | -0.54 | [-0.77, -0.30] | |
| ➖ | http_to_s3 | ingress throughput | -0.61 | [-0.89, -0.34] | |
| ➖ | fluent_elasticsearch | ingress throughput | -0.62 | [-1.11, -0.13] | |
| ➖ | socket_to_socket_blackhole | ingress throughput | -1.37 | [-1.42, -1.31] | |
| ➖ | datadog_agent_remap_blackhole | ingress throughput | -1.79 | [-1.90, -1.67] | |
| ➖ | datadog_agent_remap_blackhole_acks | ingress throughput | -2.02 | [-2.12, -1.92] | |
| ➖ | splunk_hec_route_s3 | ingress throughput | -2.03 | [-2.34, -1.73] | |
| ➖ | otlp_http_to_blackhole | ingress throughput | -2.48 | [-2.62, -2.34] | |
| ➖ | http_elasticsearch | ingress throughput | -2.55 | [-2.71, -2.38] | |
| ➖ | datadog_agent_remap_datadog_logs_acks | ingress throughput | -2.89 | [-3.10, -2.69] | |
| ➖ | syslog_regex_logs2metric_ddmetrics | ingress throughput | -2.91 | [-3.03, -2.79] | |
| ➖ | file_to_blackhole | egress throughput | -5.05 | [-11.51, +1.41] | |

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that *if our statistical model is accurate*, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
3. Its configuration does not mark it "erratic".
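The decision rule above can be sketched in a few lines of Python. This is an illustrative reading of the three criteria, not the Regression Detector's actual code; the `ExperimentResult` type and the `is_regression` function are hypothetical names, and the sample values are copied from the tables in this report.

```python
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    """One table row (hypothetical type, for illustration only)."""
    name: str
    delta_mean_pct: float  # "Δ mean %"
    ci_low: float          # lower bound of "Δ mean % CI"
    ci_high: float         # upper bound of "Δ mean % CI"
    erratic: bool = False  # True when the settings contain `erratic: true`


def is_regression(result, effect_size_tolerance=5.0):
    """Return True only when all three criteria hold."""
    # 1. The estimated change is large enough to merit a closer look.
    big_enough = abs(result.delta_mean_pct) >= effect_size_tolerance
    # 2. The confidence interval does not contain zero.
    ci_excludes_zero = not (result.ci_low <= 0.0 <= result.ci_high)
    # 3. The experiment is not marked erratic.
    return big_enough and ci_excludes_zero and not result.erratic


file_to_blackhole = ExperimentResult(
    "file_to_blackhole", -5.05, -11.51, 1.41, erratic=True
)
otlp_http_to_blackhole = ExperimentResult(
    "otlp_http_to_blackhole", -2.48, -2.62, -2.34
)

print(is_regression(file_to_blackhole))       # False
print(is_regression(otlp_http_to_blackhole))  # False
```

`file_to_blackhole` passes the effect-size check but fails the other two: its interval spans zero and it is marked erratic. `otlp_http_to_blackhole` has an interval that excludes zero, but its change is below the 5.00% tolerance. Neither is flagged, which matches the report's summary.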