Idea

Use existing software like Filebeat to write logs to Kafka, read the messages with ClickHouse, and provide an API that converts LogQL queries into ClickHouse SQL.

Docs

Filebeat Kafka Docs

Potential Table layout:

As we don't want to configure the whole ECS inside ClickHouse, we would potentially store only the most frequently queried fields as separate columns and keep the rest inside a Map.
CREATE TABLE libbeat_data
(
    `@timestamp` DateTime64(3) CODEC(Delta(8), ZSTD(1)),
    `message` String CODEC(ZSTD(1)),
    `fields` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_fields_key mapKeys(fields) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_fields_value mapValues(fields) TYPE bloom_filter(0.01) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toDate(`@timestamp`)
ORDER BY (toUnixTimestamp(`@timestamp`))
TTL toDateTime(`@timestamp`) + toIntervalDay(3)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1
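
To get the messages from Kafka into this table, ClickHouse could consume the topic with a Kafka engine table plus a materialized view. A minimal sketch, assuming Filebeat's JSON output; the broker address, topic name, consumer group, and the flattening of remaining fields into the Map are assumptions, not part of the original design:

-- Kafka consumer table: pulls raw Filebeat JSON messages (connection settings are assumptions)
CREATE TABLE libbeat_kafka_queue
(
    `message` String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'filebeat',
         kafka_group_name = 'clickhouse-logs',
         kafka_format = 'JSONAsString';

-- Materialized view: parses each JSON message and writes it into libbeat_data
CREATE MATERIALIZED VIEW libbeat_kafka_mv TO libbeat_data AS
SELECT
    parseDateTime64BestEffort(JSONExtractString(message, '@timestamp'), 3) AS `@timestamp`,
    JSONExtractString(message, 'message') AS `message`,
    -- flatten the remaining top-level fields into the Map column (nested objects are simplified away here)
    CAST(JSONExtractKeysAndValues(message, 'String'), 'Map(String, String)') AS fields
FROM libbeat_kafka_queue;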
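
The query API would then translate LogQL into SQL over this table, turning label matchers into Map lookups and line filters into string matching. A hypothetical translation of {app="nginx"} |= "error" over the last hour; the label name, value, time range, and limit are illustrative, not a fixed mapping:

-- Hypothetical translation of the LogQL query {app="nginx"} |= "error"
SELECT `@timestamp`, `message`
FROM libbeat_data
WHERE fields['app'] = 'nginx'                    -- label matcher on the Map column
  AND position(`message`, 'error') > 0           -- line filter |= "error"
  AND `@timestamp` >= now() - INTERVAL 1 HOUR    -- query time range
ORDER BY `@timestamp` DESC
LIMIT 1000

The mapKeys/mapValues bloom filter indexes defined above are intended to help skip granules for these Map lookups.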