Pinging code owners:
exporter/clickhouse: @hanjm @dmitryax @Frapschen @SpencerTorres
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I would like to see the full table DDL for otel_logs via SHOW TABLE otel_logs. I noticed you're using clustering and ReplicatedReplacingMergeTree. It's possible that the logs are being removed as duplicates.
You can also validate the rest of your configuration (memory/batch limiting) by writing logs to a file or other exporter. I see the debug log line says 20 though, so it seems like this is indeed isolated to the ClickHouse exporter/server.
You can also check the system.query_log table for the INSERT's written_rows or result_rows. I believe this value would reflect the complete count of inserted rows rather than the final rows, because a replacing merge tree can still contain duplicates if parts haven't merged yet.
@Masmiiadm let me know if this is still an issue. As noted in the comment above, I think ReplicatedReplacingMergeTree is causing similar rows to be combined, leading to the mismatch in the row counts.
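To illustrate the suspected effect, here is a minimal Python sketch of how a replacing merge collapses rows that share the same sorting key. It is a simulation, not ClickHouse code: the key columns (ServiceName, Timestamp), the sample rows, and the helper replacing_merge are all hypothetical, since the actual ORDER BY of otel_logs is not shown in this issue.

```python
def replacing_merge(rows, key_columns):
    """Simulate a replacing merge: keep the last-inserted row per sorting key."""
    merged = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        merged[key] = row  # a later row replaces an earlier one with the same key
    return list(merged.values())


# Hypothetical batch: 20 log records, but only 5 distinct
# (ServiceName, Timestamp) pairs, e.g. because of coarse timestamps.
rows = [
    {"ServiceName": "demo", "Timestamp": i % 5, "Body": f"line {i}"}
    for i in range(20)
]

merged = replacing_merge(rows, ("ServiceName", "Timestamp"))
print(len(rows), "inserted ->", len(merged), "after merge")
# prints: 20 inserted -> 5 after merge
```

If the real sorting key behaves like this, the exporter would report 20 records while count(*) returns 5, even though every INSERT succeeded.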
Hello @SpencerTorres, sorry for the late reply. Yes, it was indeed an issue with ReplicatedReplacingMergeTree. Thanks for your support.
Component(s)
exporter/clickhouse
Description
I am using the OpenTelemetry Collector with the filelog receiver on a Kubernetes cluster to collect logs (see configuration below). The logs are then inserted into a ClickHouse server using the ClickHouse exporter. However, I am noticing significant data loss. To investigate further, I limited the collection to a single container. In the collector logs, I see that 20 records were inserted:
(2024-09-28T15:38:43.890Z debug clickhouseexporter@v0.110.0/exporter_logs.go:127 insert logs {"kind": "exporter", "data_type": "logs", "name": "clickhouse", **"records": 20**, "cost": "48.506672ms"})
But when I execute SELECT count(*) FROM otel_logs in ClickHouse, I only see 5 records. This means that 15 records have disappeared. Can someone help me identify the cause of this data loss?
Steps to reproduce
1. Configure the OpenTelemetry Collector with the filelog receiver and ClickHouse exporter.
2. Limit log collection to a single container.
3. Check the logs in the OpenTelemetry Collector and the row count in ClickHouse for discrepancies.
What is expected
The number of logs inserted into ClickHouse should match the number of records shown in the OpenTelemetry Collector logs.
What is happening
The OpenTelemetry Collector logs show that 20 records were inserted, but ClickHouse only has 5 records.
OpenTelemetry config file: