SigNoz / opentelemetry-collector-contrib

Apache License 2.0

clickhousemetricsexporter panic on initial install of SigNoz #771

Closed nickb937 closed 2 years ago

nickb937 commented 2 years ago

Describe the bug

This crash occurs periodically on a new installation of SigNoz 0.8 when there is no data:

2022-05-16T06:25:21.858Z    info    service/telemetry.go:95 Setting up own telemetry...
2022-05-16T06:25:21.859Z    info    service/telemetry.go:115    Serving Prometheus metrics  {"address": ":8888", "level": "basic", "service.instance.id": "91608d37-eea4-4139-8dd7-c3258df6798c", "service.version": "latest"}
2022-05-16T06:25:21.859Z    info    service/collector.go:229    Starting otelcontribcol...  {"Version": "latest", "NumCPU": 2}
2022-05-16T06:25:21.859Z    info    service/collector.go:124    Everything is ready. Begin running and processing data.
panic: runtime error: index out of range [-1]

goroutine 105 [running]:
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhousemetricsexporter.addSingleHistogramDataPoint({0xc00054c000}, {0x8}, {0x46df53}, {0x0, 0x0}, 0x5, 0xc000430320)
    /src/exporter/clickhousemetricsexporter/helper.go:397 +0x8bd
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhousemetricsexporter.(*PrwExporter).PushMetrics(0xc00054c000, {0x4fe1728, 0xc0006ac360}, {0x4fe1760})
    /src/exporter/clickhousemetricsexporter/exporter.go:168 +0xad7
go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsRequest).export(0x4fe1760, {0x4fe1728, 0xc0006ac360})
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/metrics.go:67 +0x34
go.opentelemetry.io/collector/exporter/exporterhelper.(*timeoutSender).send(0xc0009ee248, {0x504d4d0, 0xc00098c120})
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/common.go:232 +0x96
go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send(0xc000a5e000, {0x504d4d0, 0xc00098c120})
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/queued_retry.go:176 +0x5eb
go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send(0xc00011c930, {0x504d4d0, 0xc00098c120})
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/metrics.go:134 +0x88
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1({0x43215a0, 0xc00098c120})
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/queued_retry_inmemory.go:105 +0x5c
go.opentelemetry.io/collector/exporter/exporterhelper/internal.consumerFunc.consume(0xc000cabfa8, {0x43215a0, 0xc00098c120})
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/internal/bounded_memory_queue.go:99 +0x2c
go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers.func2()
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/internal/bounded_memory_queue.go:78 +0xd6
created by go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers
    /go/pkg/mod/go.opentelemetry.io/collector@v0.43.0/exporter/exporterhelper/internal/bounded_memory_queue.go:68 +0xa5

The failing line suggests that pt.BucketCounts() is empty (length 0) when there is no data, so the index evaluates to -1:

cumulativeCount += pt.BucketCounts()[len(pt.BucketCounts())-1]
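
A minimal sketch of how that access could be guarded, assuming the bucket counts are available as a plain `[]uint64`. The helper name `lastBucketCount` is hypothetical and is not part of the exporter; this is an illustration of the failure mode, not the fix that was actually applied:

```go
package main

import "fmt"

// lastBucketCount is a hypothetical helper (not part of the exporter).
// It returns the final element of bucketCounts and true, or 0 and false
// when the slice is empty, so the caller never indexes at len-1 == -1.
func lastBucketCount(bucketCounts []uint64) (uint64, bool) {
	if len(bucketCounts) == 0 {
		return 0, false
	}
	return bucketCounts[len(bucketCounts)-1], true
}

func main() {
	var cumulativeCount uint64 = 5

	// Empty bucket counts: the unguarded expression would panic here.
	if last, ok := lastBucketCount(nil); ok {
		cumulativeCount += last
	}
	fmt.Println(cumulativeCount) // prints 5

	// Non-empty bucket counts: the last bucket is added as before.
	if last, ok := lastBucketCount([]uint64{1, 2, 7}); ok {
		cumulativeCount += last
	}
	fmt.Println(cumulativeCount) // prints 12
}
```

Whether an empty data point should be skipped, logged, or treated as an error is a separate decision that this sketch leaves to the caller.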

Steps to reproduce

helm --namespace platform install signoz signoz/signoz --set clickhouseOperator.storage=100Gi --set frontend.image.tag="0.8.0" --set queryService.image.tag="0.8.0"

Do nothing else.

What version did you use? Version: 0.8.0

What config did you use? Config: (e.g. the yaml config file)

otel-collector-metrics-config.yaml:
----
exporters:
  clickhousemetricswrite:
    endpoint: tcp://${CLICKHOUSE_HOST}:${CLICKHOUSE_PORT}/?database=${CLICKHOUSE_DATABASE}&username=${CLICKHOUSE_USER}&password=${CLICKHOUSE_PASSWORD}
extensions:
  health_check: {}
  zpages: {}
processors:
  batch:
    send_batch_size: 1000
    timeout: 10s
receivers:
  otlp:
    protocols:
      grpc: null
      http: null
  prometheus:
    config:
      scrape_configs:
      - job_name: otel-collector
        scrape_interval: 30s
        static_configs:
        - targets:
          - signoz-otel-collector:8889
service:
  extensions:
  - health_check
  - zpages
  pipelines:
    metrics:
      exporters:
      - clickhousemetricswrite
      processors:
      - batch
      receivers:
      - otlp
      - prometheus

Environment OS: "Ubuntu 18.04" Azure Kubernetes

pranay01 commented 2 years ago

Thanks for opening the issue, @nickb937. We will have a look at it.

makeavish commented 2 years ago

@srikanthccv

srikanthccv commented 2 years ago

Looking into this. Generally, len(pt.BucketCounts()) should never be zero, since there is always at least one bucket.
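
The expectation above can be sketched as a shape check. This assumes the usual OTLP convention for explicit-bounds histograms, where a data point carries one more bucket count than it has bounds (the extra one being the overflow bucket). The function name `checkHistogramShape` is hypothetical, and the sketch works on plain slices rather than the collector's pdata types:

```go
package main

import (
	"errors"
	"fmt"
)

// checkHistogramShape is a hypothetical validator (not part of the exporter).
// It reports an error when a histogram data point has no bucket counts, or
// when the number of bucket counts is not len(explicitBounds)+1.
func checkHistogramShape(explicitBounds []float64, bucketCounts []uint64) error {
	if len(bucketCounts) == 0 {
		return errors.New("histogram data point has no bucket counts")
	}
	if want := len(explicitBounds) + 1; len(bucketCounts) != want {
		return fmt.Errorf("got %d bucket counts for %d bounds, want %d",
			len(bucketCounts), len(explicitBounds), want)
	}
	return nil
}

func main() {
	// The case from this report: a data point with no buckets at all.
	fmt.Println(checkHistogramShape(nil, nil))

	// A well-formed point: two bounds, three buckets.
	fmt.Println(checkHistogramShape([]float64{0.5, 1}, []uint64{4, 2, 1}))
}
```

A check like this would make the empty case visible as an error before any indexing happens, instead of surfacing as a panic in the export path.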