open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[Pdata] LogRecordCount call should not cause panics #10033

Open splunkericl opened 3 weeks ago

splunkericl commented 3 weeks ago

Describe the bug Intermittently, calling LogRecordCount inside the exporter batcher causes a panic in the goroutine.

Steps to reproduce Unsure exactly how this happens. It occurs from time to time when data is flowing.

What did you expect to see? Data ingested and exported successfully.

What did you see instead? The goroutine panics.

What version did you use? Collector v0.96 with pdata v1.5.

What config did you use?

Environment Linux and macOS

Additional context Panic stack trace example:

panic({0x449e640?, 0x8083de0?})
runtime/panic.go:914 +0x21f
go.opentelemetry.io/collector/pdata/plog.ResourceLogsSlice.At(...)
go.opentelemetry.io/collector/pdata@v1.5.0/plog/generated_resourcelogsslice.go:56
go.opentelemetry.io/collector/pdata/plog.Logs.LogRecordCount({0xc009189050?, 0xc00535cda4?})
go.opentelemetry.io/collector/pdata@v1.5.0/plog/logs.go:48 +0x20
go.opentelemetry.io/collector/exporter/exporterhelper.(*logsRequest).ItemsCount(0xc0015df260?)
go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/logs.go:63 +0x1d
go.opentelemetry.io/collector/exporter/exporterhelper.(*logsExporterWithObservability).send(0xc001929e90, {0x5739760?, 0xc00a8cfdb0?}, {0x5706360?, 0xc009189068?})
go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/logs.go:156 +0x98
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseRequestSender).send(0x0?, {0x5739760?, 0xc00a8cfdb0?}, {0x5706360?, 0xc009189068?})
go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/common.go:35 +0x30
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send(0xc0013bf7c0, {0x5739760?, 0xc00a8cfdb0?}, {0x5706360?, 0xc009189068?})
go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/common.go:211 +0x66
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsRequestExporter.func1({0x5739760, 0xc00a8cfdb0}, {0xc009189050?, 0xc00535cda4?})
go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/logs.go:131 +0x325
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs(...)
go.opentelemetry.io/collector/consumer@v0.96.0/logs.go:25
github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor.(*logProcessor).route(0xc001922690, {0x5739760, 0xc00a8cfdb0}, {0xc009188f90?, 0xc00535ccf4?})
github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor@v0.96.0/logs.go:139 +0x45f
github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor.(*logProcessor).ConsumeLogs(0xc001959a70?, {0x5739760?, 0xc00a8cfdb0?}, {0xc009188f90?, 0xc00535ccf4?})
github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor@v0.96.0/logs.go:79 +0x32
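For context, the ResourceLogsSlice.At frame at generated_resourcelogsslice.go:56 in the trace is the bounds-checked element access performed while LogRecordCount walks the resource and scope log slices. The sketch below is a simplified paraphrase of that loop (not the exact pdata source); if the same plog.Logs were mutated or reused by another goroutine between Len and At, this access could panic.

package main

import (
	"fmt"

	"go.opentelemetry.io/collector/pdata/plog"
)

// logRecordCount is a simplified paraphrase of plog.Logs.LogRecordCount: it
// sums the LogRecords of every ScopeLogs under every ResourceLogs.
func logRecordCount(ld plog.Logs) int {
	count := 0
	rls := ld.ResourceLogs()
	for i := 0; i < rls.Len(); i++ {
		// At is a bounds-checked index into the underlying slice; it can
		// panic if the Logs is modified between the Len and At calls.
		sls := rls.At(i).ScopeLogs()
		for j := 0; j < sls.Len(); j++ {
			count += sls.At(j).LogRecords().Len()
		}
	}
	return count
}

func main() {
	ld := plog.NewLogs()
	ld.ResourceLogs().AppendEmpty().ScopeLogs().AppendEmpty().LogRecords().AppendEmpty()
	fmt.Println(logRecordCount(ld)) // prints 1
}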
splunkericl commented 3 weeks ago

We are attempting to upgrade to collector v0.99 and pdata v1.6 to see if that fixes anything, but checking through the release notes, I don't see any bug fixes in these areas.

splunkericl commented 1 week ago

We are still seeing it with collector v0.99 and pdata v1.6:

go.opentelemetry.io/collector/pdata/plog.ResourceLogsSlice.At(...)
  go.opentelemetry.io/collector/pdata@v1.6.0/plog/generated_resourcelogsslice.go:56
go.opentelemetry.io/collector/pdata/plog.Logs.LogRecordCount({0xc08f2bbde8?, 0xc08f4741cc?})
  go.opentelemetry.io/collector/pdata@v1.6.0/plog/logs.go:48 +0x20
go.opentelemetry.io/collector/exporter/exporterhelper.(*logsRequest).ItemsCount(0xc0018702a0?)
  go.opentelemetry.io/collector/exporter@v0.99.0/exporterhelper/logs.go:63 +0x1d
go.opentelemetry.io/collector/exporter/exporterhelper.(*logsExporterWithObservability).send(0xc0017e7a10, {0x59ad4e0?, 0xc089e379f0?}, {0x5970820?, 0xc08f2bbe00?})
  go.opentelemetry.io/collector/exporter@v0.99.0/exporterhelper/logs.go:159 +0x98
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseRequestSender).send(0xc04444e968?, {0x59ad4e0?, 0xc089e379f0?}, {0x5970820?, 0xc08f2bbe00?})
  go.opentelemetry.io/collector/exporter@v0.99.0/exporterhelper/common.go:37 +0x30
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseRequestSender).send(0x0?, {0x59ad4e0?, 0xc089e379f0?}, {0x5970820?, 0xc08f2bbe00?})
  go.opentelemetry.io/collector/exporter@v0.99.0/exporterhelper/common.go:37 +0x30
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send(0xc0015538c0, {0x59ad4e0?, 0xc089e379f0?}, {0x5970820?, 0xc08f2bbe00?})
  go.opentelemetry.io/collector/exporter@v0.99.0/exporterhelper/common.go:290 +0x66
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsRequestExporter.func1({0x59ad4e0, 0xc089e379f0}, {0xc08f2bbde8?, 0xc08f4741cc?})
  go.opentelemetry.io/collector/exporter@v0.99.0/exporterhelper/logs.go:134 +0x325
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs(...)
  go.opentelemetry.io/collector/consumer@v0.99.0/logs.go:25
github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor.(*logProcessor).route(0xc0019de1e0, {0x59ad4e0, 0xc089e379f0}, {0xc08f2bbd58?, 0xc08f4740c8?})
  github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor@v0.99.0/logs.go:148 +0x45f
github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor.(*logProcessor).ConsumeLogs(0xc001921530?, {0x59ad4e0?, 0xc089e379f0?}, {0xc08f2bbd58?, 0xc08f4740c8?})
  github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor@v0.99.0/logs.go:88 +0x32
cd.splunkdev.com/data-availability/acies/otel-collector/processor/splunkaciesprocessor.(*aciesProcessor).ConsumeLogs(0xc001151d80, {0x59ad4e0, 0xc089e379f0}, {0xc08f2bbd40?, 0xc08f4740c0?})
  cd.splunkdev.com/data-availability/acies/otel-collector/processor/splunkaciesprocessor/processor.go:153 +0x109
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs(...)
  go.opentelemetry.io/collector/consumer@v0.99.0/logs.go:25
go.opentelemetry.io/collector/internal/fanoutconsumer.(*logsConsumer).ConsumeLogs(0xc0019215f0, {0x59ad4e0, 0xc089e379f0}, {0xc08f2bbd40?, 0xc08f4740c0?})
  go.opentelemetry.io/collector@v0.99.0/internal/fanoutconsumer/logs.go:62 +0x1fe
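One hypothetical way to probe this is to race LogRecordCount against a mutation of the same plog.Logs under the Go race detector. This is only a sketch of the suspected concurrent-access scenario (e.g. one payload shared across consumers), not a confirmed reproduction of this issue; run it with `go run -race` and note the panic is timing-dependent and may not fire on every run.

package main

import (
	"fmt"
	"sync"

	"go.opentelemetry.io/collector/pdata/plog"
)

func main() {
	ld := plog.NewLogs()
	for i := 0; i < 100; i++ {
		ld.ResourceLogs().AppendEmpty().ScopeLogs().AppendEmpty().LogRecords().AppendEmpty()
	}

	var wg sync.WaitGroup
	wg.Add(2)

	// Reader: repeatedly counts records, iterating the ResourceLogsSlice.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = ld.LogRecordCount()
		}
	}()

	// Writer: shrinks and regrows the slice while the reader may be iterating.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			ld.ResourceLogs().RemoveIf(func(plog.ResourceLogs) bool { return true })
			ld.ResourceLogs().AppendEmpty().ScopeLogs().AppendEmpty().LogRecords().AppendEmpty()
		}
	}()

	wg.Wait()
	fmt.Println("final count:", ld.LogRecordCount())
}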