netobserv / flowlogs-pipeline

Transform flow logs into metrics
Apache License 2.0

Error: concurrent map iteration and map write #234

Closed (jotak closed this 2 years ago)

jotak commented 2 years ago

I got this race today:

fatal error: concurrent map iteration and map write

goroutine 149 [running]:
runtime.throw({0x1c07bf2, 0x462959})
    /usr/local/go/src/runtime/panic.go:1198 +0x71 fp=0xc000032f48 sp=0xc000032f18 pc=0x4378b1
runtime.mapiternext(0x14)
    /usr/local/go/src/runtime/map.go:858 +0x4eb fp=0xc000032fb8 sp=0xc000032f48 pc=0x41168b
reflect.mapiternext(0x195)
    /usr/local/go/src/runtime/map.go:1346 +0x19 fp=0xc000032fd0 sp=0xc000032fb8 pc=0x462899
reflect.(*MapIter).Next(0xc001183de0)
    /usr/local/go/src/reflect/value.go:1631 +0x99 fp=0xc000032ff0 sp=0xc000032fd0 pc=0x49ba59
internal/fmtsort.Sort({0x19c0e20, 0xc001183d78, 0xc00106e620})
    /usr/local/go/src/internal/fmtsort/sort.go:63 +0x159 fp=0xc0000330b0 sp=0xc000032ff0 pc=0x4b0b99
fmt.(*pp).printValue(0xc0006fa4e0, {0x19c0e20, 0xc001183d78, 0x4d41c5}, 0x76, 0x1)
    /usr/local/go/src/fmt/print.go:769 +0x445 fp=0xc000033298 sp=0xc0000330b0 pc=0x4ecde5
fmt.(*pp).printValue(0xc0006fa4e0, {0x183e020, 0xc001c51e90, 0x1210}, 0x76, 0x0)
    /usr/local/go/src/fmt/print.go:865 +0x17ae fp=0xc000033480 sp=0xc000033298 pc=0x4ee14e
fmt.(*pp).printArg(0xc0006fa4e0, {0x183e020, 0xc001c51e90}, 0x76)
    /usr/local/go/src/fmt/print.go:712 +0x74c fp=0xc000033520 sp=0xc000033480 pc=0x4ec90c
fmt.(*pp).doPrintf(0xc0006fa4e0, {0x1be2a59, 0x10}, {0xc0000337b0, 0x47301e, 0xc0000336c8})
    /usr/local/go/src/fmt/print.go:1026 +0x288 fp=0xc000033618 sp=0xc000033520 pc=0x4ef108
fmt.Sprintf({0x1be2a59, 0x10}, {0xc0000337b0, 0x1, 0x1})
    /usr/local/go/src/fmt/print.go:219 +0x59 fp=0xc000033670 sp=0xc000033618 pc=0x4e9519
github.com/sirupsen/logrus.(*Entry).Logf(0xc0004915e0, 0x5, {0x1be2a59, 0xc0000336f8}, {0xc0000337b0, 0x18, 0x18f6ac0})
    /app/vendor/github.com/sirupsen/logrus/entry.go:338 +0x49 fp=0xc0000336b8 sp=0xc000033670 pc=0x8f72a9
github.com/sirupsen/logrus.(*Logger).Logf(0xc000227180, 0x5, {0x1be2a59, 0x10}, {0xc0000337b0, 0x1, 0x1})
    /app/vendor/github.com/sirupsen/logrus/logger.go:151 +0x85 fp=0xc000033708 sp=0xc0000336b8 pc=0x8f92a5
github.com/sirupsen/logrus.(*Logger).Debugf(...)
    /app/vendor/github.com/sirupsen/logrus/logger.go:161
github.com/sirupsen/logrus.Debugf(...)
    /app/vendor/github.com/sirupsen/logrus/exported.go:189
github.com/netobserv/flowlogs-pipeline/pkg/pipeline/ingest.(*ingestKafka).processLogLines(0xc00018a240, 0x5)
    /app/pkg/pipeline/ingest/ingest_kafka.go:100 +0x2a5 fp=0xc000033838 sp=0xc000033708 pc=0xa73945

I know @jpinsonneau fixed a similar one recently (in the Loki write stage), but this is a different one (I have Julien's patch applied and still hit it). As far as I can tell, this one happened between a map iteration (in the Kafka ingest, via the logger) and probably a map write in enrich. We should review all the map writes we do and make them safe (basically, copy on write). I'm not sure whether Go offers such a thing out of the box, but it would be nice to have immutable maps that must be explicitly copied to a mutable one before writing.
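To make the "immutable map" idea concrete, here is a minimal sketch in Go. The names `Entry`, `Frozen`, `Freeze` and `Thaw`, and the field names used in `main`, are hypothetical and not taken from the flowlogs-pipeline code. The wrapper exposes only read methods, so any write has to go through an explicit copy. Note that the copy is shallow: nested maps or slices stored as values would still be shared.

```go
package main

import "fmt"

// Entry is a hypothetical stand-in for a flow record passed between stages.
type Entry map[string]interface{}

// Frozen wraps an Entry and exposes only read operations, so code holding
// a Frozen value cannot write to the underlying map.
type Frozen struct {
	m Entry
}

// Freeze takes a private shallow copy of e, so later writes to e by the
// caller are not visible through the returned Frozen value.
func Freeze(e Entry) Frozen {
	return Frozen{m: copyEntry(e)}
}

// Get returns the value stored under key and whether it was present.
func (f Frozen) Get(key string) (interface{}, bool) {
	v, ok := f.m[key]
	return v, ok
}

// Len returns the number of fields in the frozen entry.
func (f Frozen) Len() int { return len(f.m) }

// Thaw returns a mutable shallow copy; the frozen map itself is never written.
func (f Frozen) Thaw() Entry { return copyEntry(f.m) }

// copyEntry makes a shallow copy of e (values are not deep-copied).
func copyEntry(e Entry) Entry {
	out := make(Entry, len(e))
	for k, v := range e {
		out[k] = v
	}
	return out
}

func main() {
	frozen := Freeze(Entry{"SrcAddr": "10.0.0.1"})

	// A stage that wants to add a field must copy first.
	mutable := frozen.Thaw()
	mutable["Cluster"] = "example"

	fmt.Println(frozen.Len(), len(mutable)) // prints "1 2"
}
```

Since the wrapped map is never written after `Freeze` returns, several goroutines can read the same `Frozen` value concurrently (for instance one logging it while another thaws it) without triggering the runtime check seen in the trace above.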

cc @mariomac @eranra

eranra commented 2 years ago

@KalmanMeth might this be connected to what you suggested about the other stages that are not copying input entries?
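For illustration, a stage that copies its input entries before modifying them could look roughly like the sketch below. `Entry` and `enrichCopy` are hypothetical names, not the actual flowlogs-pipeline API, and the added field is just an example. The point is only that the stage writes to its own copies, so a goroutine still iterating over the input maps (for example while formatting a debug log line) never observes a write.

```go
package main

import "fmt"

// Entry is a hypothetical stand-in for a flow record passed between stages.
type Entry map[string]interface{}

// enrichCopy returns new entries with key=value added. The input entries
// are only read, never written, so another goroutine may keep iterating
// over them safely. The copy is shallow: nested values are shared.
func enrichCopy(in []Entry, key string, value interface{}) []Entry {
	out := make([]Entry, 0, len(in))
	for _, e := range in {
		c := make(Entry, len(e)+1)
		for k, v := range e {
			c[k] = v
		}
		c[key] = value
		out = append(out, c)
	}
	return out
}

func main() {
	in := []Entry{{"SrcAddr": "10.0.0.1"}}
	out := enrichCopy(in, "Cluster", "example")

	// The input entry keeps its single field; the output has two.
	fmt.Println(len(in[0]), len(out[0])) // prints "1 2"
}
```

The trade-off is one extra allocation and copy per entry per stage, which may or may not be acceptable at high flow rates.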

eranra commented 2 years ago

@jotak @mariomac is this something that will be fixed once @mariomac merges the "complete" channels-based approach?