hagen1778 opened 1 year ago
In addition to out-of-order labels, Thanos, Cortex, Mimir and Prometheus may reject samples from vmagent with out-of-order timestamps. This happens because vmagent writes data to the configured remote storage via multiple concurrent connections. Samples for the same time series may be sent concurrently over several such connections, so a sample with a newer timestamp can be delivered before a sample with an older timestamp. This results in an `out of order samples` error at Prometheus, Thanos, Cortex and Mimir. It can be fixed by running vmagent with the `-remoteWrite.queues=1` command-line flag, which instructs vmagent to use only a single connection to the configured `-remoteWrite.url` for sending the data. See vmagent troubleshooting for details.
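The reordering described above can be illustrated with a minimal Go sketch (this is not vmagent's actual code, just a simulation of two remote-write queues with different latencies): the queue carrying the newer sample delivers it first, so the remote storage sees the timestamps out of order.

```go
package main

import (
	"fmt"
	"time"
)

// simulate models two concurrent remote-write queues with different
// latencies. The returned slice is the order in which the remote
// storage receives the samples' timestamps.
func simulate() []int64 {
	received := make(chan int64, 2)

	// Queue 1 carries the older sample (ts=1) but is slow.
	go func() {
		time.Sleep(100 * time.Millisecond)
		received <- 1
	}()

	// Queue 2 carries the newer sample (ts=2) and is fast.
	go func() {
		time.Sleep(10 * time.Millisecond)
		received <- 2
	}()

	return []int64{<-received, <-received}
}

func main() {
	// With -remoteWrite.queues=1 there would be only one connection,
	// so samples for a series would arrive in timestamp order.
	fmt.Println(simulate()) // [2 1]: the newer timestamp is delivered first
}
```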
I'm running into the same issue, with the `-sortLabels` flag set... This issue has nothing to do with out-of-order samples, @valyala, especially since I have set `--tsdb.out-of-order.time-window` on the Thanos receiver. This issue is about out-of-order labels, which does not make sense with the `-sortLabels` flag set. I started to get the error `Error on series with out-of-order labels` on version v1.96.0; previously, on version v1.89.1, the error did not occur and vmagent could write all the metrics to the Thanos receiver successfully.
I figured out that the `-remoteWrite.label` labels are now being added at the end of the metric labels, without being sorted together with the original labels. Why did this behavior change? Can it be reverted?
On version v1.93.0, with the `-sortLabels` and `-remoteWrite.label` flags set, everything works as expected. On version v1.93.1 the `out of order labels` error starts occurring, so `-sortLabels` stopped working as intended: it no longer includes the remoteWrite labels in the sort.
@Amper @valyala I think the issue is related to the following change https://github.com/VictoriaMetrics/VictoriaMetrics/commit/a27c2f37731986f4bf6738404bb6388b1f42ffde
Shall we sort labels once again if `-remoteWrite.label` was applied?

> Shall we sort labels once again if `-remoteWrite.label` was applied?

It is better from a performance PoV to sort labels only once, just before sending them to the remote storage.
I am experiencing this in v1.99 using docker-compose locally. What is the solution here?
```
vm-agent | 2024-04-03T14:57:32.117Z error VictoriaMetrics/app/vmagent/remotewrite/client.go:444 sending a block with size 21432 bytes to "1:secret-url" was rejected (skipping the block): status code 409; response body: store locally for endpoint : add 702 series: out of order labels
thanos-receiver | ts=2024-04-03T14:57:32.11231392Z caller=writer.go:238 level=info component=receive component=receive-writer tenant=default-tenant msg="Error on series with out-of-order labels" numDropped=702
```
vmagent + exporter configuration:
```yaml
version: '3.5'
services:
  node_exporter:
    image: quay.io/prometheus/node-exporter:v1.7.0
    container_name: node-exporter
    command:
      - --path.rootfs=/host
      # Metrics endpoint at http://localhost:9100/metrics
      - --web.listen-address=:9100
      # Enable systemd collector to monitor autossh, postgres etc.
      - --collector.systemd
      - --collector.systemd.unit-include=^(autossh|postgresql)
      # Enable wifi collector to monitor simcard etc.
      - --collector.wifi
    security_opt:
      # Required to access systemd
      - apparmor:unconfined
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - /:/host:ro,rslave
      - /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket
  vm_agent:
    image: victoriametrics/vmagent:v1.99.0
    container_name: vm-agent
    command:
      - -promscrape.config=/vmagent.config.yml
      # This should be http://<edge-server-ip>:10908/api/v1/receive
      - -remoteWrite.url=http://localhost:10908/api/v1/receive
      # Set persistent volume to store data
      - -remoteWrite.tmpDataPath=/vmagent/data
      # Set unique labels for device
      - -remoteWrite.label=device=test_1
      # Needed for thanos to accept the data
      - -sortLabels
    network_mode: host
    restart: unless-stopped
    volumes:
      - ./vmagent.config.yml:/vmagent.config.yml:ro
      - ./vmagent/data:/vmagent/data
```
Thanos receiver config:
```yaml
version: '3.5'
services:
  thanos_receiver:
    image: quay.io/thanos/thanos:v0.34.1
    container_name: thanos-receiver
    command: >
      receive
      --tsdb.path="/thanos/data"
      --tsdb.retention=14d
      --label=stage='"production"'
      --label=cluster='"staging"'
      --label=region='"eu-west-1"'
      --label=receive_replica='"0"'
      --grpc-address="0.0.0.0:10907"
      --http-address="0.0.0.0:10909"
      --remote-write.address="0.0.0.0:10908"
      --objstore.config-file="/bucket.yml"
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - ./thanos/data:/thanos/data:rw
      - ./bucket.yml:/bucket.yml:ro
```
Nvm, the solution is to not use the `-remoteWrite.label` flag at all... It works fine with `global.external_labels` in the promscrape config.
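For reference, the workaround looks roughly like this in the vmagent `-promscrape.config` file (a sketch assuming the Prometheus-compatible scrape config format that vmagent accepts; the `device` label value is just the one from the example above). Labels set via `global.external_labels` are attached during scraping, so they are included when `-sortLabels` sorts the series:

```yaml
global:
  # Attached to every scraped series, so the label sort includes them.
  external_labels:
    device: test_1

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
```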
Is this issue being worked on? I notice that https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5874 was just closed and not merged.
I am using the workaround in the promscrape config, but that's obviously brittle. Would you accept a PR here / do you know why https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5874 was closed?
Describe the bug
While using vmagent to push metrics to thanos-receiver, the latter drops the series with an "Error on series with out-of-order labels" message.
To Reproduce
"I tried to send metrics from VMAgent to Thanos Receive as I wanna do some performance benchmarking between usuing VMAgent and Prometheus. In Thanos Receive I get the following error messages, when trying to send metrics from VMAgent via remote write URL"
Additional information
The labels sort requirement was enforced by Thanos receiver here. Prometheus doesn't enforce this yet.
vmagent doesn't sort labels by default, but this behavior can be changed by passing the `-sortLabels` command-line flag. The flag was introduced in v1.58.
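As a simplified illustration of the receiver-side check (this is not Thanos's actual code): a series is dropped when its label names are not in strictly ascending order, and the dropped series are counted, matching the `numDropped` field in the log line above.

```go
package main

import "fmt"

type label struct{ Name, Value string }
type series struct{ Labels []label }

// countOutOfOrder mimics, in simplified form, the receiver-side
// validation: any series whose label names are not strictly ascending
// is dropped, and the number of dropped series is returned.
func countOutOfOrder(batch []series) (dropped int) {
	for _, s := range batch {
		for i := 1; i < len(s.Labels); i++ {
			if s.Labels[i].Name <= s.Labels[i-1].Name {
				dropped++
				break
			}
		}
	}
	return dropped
}

func main() {
	batch := []series{
		// Sorted labels: accepted.
		{Labels: []label{{"__name__", "up"}, {"instance", "a"}, {"job", "node"}}},
		// "device" appended after "job": out of order, dropped.
		{Labels: []label{{"__name__", "up"}, {"instance", "a"}, {"job", "node"}, {"device", "t1"}}},
	}
	fmt.Println(countOutOfOrder(batch)) // 1
}
```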