srikanthccv opened 1 month ago
There is work underway to move batching into the exporter, but it will take a while to stabilize.
Just curious, why not use the batch processor?
The batch processor queues the item and returns immediately. Kafka receiver then marks the message as consumed, despite it not yet being written to storage. This creates a risk of data loss if the collector crashes or the storage backend becomes unavailable for an extended period. We should only mark messages as consumed after receiving confirmation from ClickHouse that the data has been successfully written.
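A minimal sketch of the commit discipline described above, using hypothetical stand-in types (`Message`, `Storage`, `Consumer` are illustrative only; the real receiver works with sarama consumer-group sessions and the ClickHouse exporter): the offset is committed only after the storage write returns successfully, so an unwritten message is redelivered after a crash instead of being lost.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a stand-in for a consumed Kafka message.
type Message struct {
	Offset  int64
	Payload string
}

// Storage is a stand-in for the ClickHouse write path.
type Storage interface {
	Write(payload string) error
}

// Consumer tracks the highest offset acknowledged back to Kafka.
type Consumer struct {
	committed int64
}

// handle writes to storage first and commits the offset only after the
// write succeeds; on error the offset stays uncommitted, so the message
// will be consumed again rather than silently dropped.
func (c *Consumer) handle(st Storage, m Message) error {
	if err := st.Write(m.Payload); err != nil {
		return err // leave offset uncommitted
	}
	c.committed = m.Offset
	return nil
}

type okStore struct{}

func (okStore) Write(string) error { return nil }

type downStore struct{}

func (downStore) Write(string) error { return errors.New("clickhouse unavailable") }

func main() {
	c := &Consumer{committed: -1}
	_ = c.handle(okStore{}, Message{Offset: 0, Payload: "span-batch-0"})
	err := c.handle(downStore{}, Message{Offset: 1, Payload: "span-batch-1"})
	// The failed write leaves committed at 0, so offset 1 is redelivered.
	fmt.Println(c.committed, err != nil)
}
```

Contrast this with the batch-processor path, where the offset would already be committed before the write is attempted.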
Increasing the receive size doesn't help reduce the number of inserts, because the kafkareceiver forwards each message to the next consumer individually. The ideal implementation would combine the spans/logs/metrics from the individual messages into one big `ResourceSpans`/`ResourceLogs`/`ResourceMetrics` by appending them, and send that single batch across the pipeline.

https://github.com/SigNoz/signoz-otel-collector/blob/c6f81e77142e8e8f69e989ba6e1e735c9a6c50c5/receiver/signozkafkareceiver/kafka_receiver.go#L470-L502
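A sketch of that merging idea, using simplified stand-in types rather than the real pdata API (`ResourceSpans` and `Traces` here are illustrative structs; in the collector this would be done on `ptrace.Traces`): accumulate the ResourceSpans from N consumed messages into one combined value, so the exporter sees one large batch instead of N small ones.

```go
package main

import "fmt"

// ResourceSpans is a simplified stand-in for a pdata resource-spans entry.
type ResourceSpans struct {
	SpanCount int
}

// Traces is a simplified stand-in for ptrace.Traces.
type Traces struct {
	ResourceSpans []ResourceSpans
}

// mergeTraces appends the ResourceSpans of each per-message Traces into
// one combined Traces, which can then be sent downstream as one batch.
func mergeTraces(msgs []Traces) Traces {
	var out Traces
	for _, td := range msgs {
		out.ResourceSpans = append(out.ResourceSpans, td.ResourceSpans...)
	}
	return out
}

func main() {
	// Two consumed messages: one with a single resource, one with two.
	batch := mergeTraces([]Traces{
		{ResourceSpans: []ResourceSpans{{SpanCount: 3}}},
		{ResourceSpans: []ResourceSpans{{SpanCount: 5}, {SpanCount: 2}}},
	})
	fmt.Println(len(batch.ResourceSpans)) // prints 3
}
```

The offsets of the merged messages would then be committed together, only after the combined batch is confirmed written.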