Obihoernchen opened this issue 4 years ago (status: Open)
@Obihoernchen thanks for this. Pasting some of the log files here:
Mär 13 21:49:56 node1 influxd[21505]: ts=2020-03-13T20:49:56.443038Z lvl=info msg="Compacting file" log_id=0LLW8k2l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0LXQQ6rW000 op_name=tsm1_compact_group tsm1_index=3 tsm1_file=/var/lib/influxdb/data/esmon_database/autogen/429/000019072-000000004.tsm
Mär 13 21:50:11 node1 influxd[21505]: ts=2020-03-13T20:50:11.466226Z lvl=info msg="Cache snapshot (start)" log_id=0LLW8k2l000 engine=tsm1 trace_id=0LXQR1Ul000 op_name=tsm1_cache_snapshot op_event=start
Mär 13 21:50:13 node1 influxd[21505]: ts=2020-03-13T20:50:13.555685Z lvl=info msg="Snapshot for path written" log_id=0LLW8k2l000 engine=tsm1 trace_id=0LXQR1Ul000 op_name=tsm1_cache_snapshot path=/var/lib/influxdb/data/esmon_database/autogen/429 duration=2103.821ms
Mär 13 21:50:13 node1 influxd[21505]: ts=2020-03-13T20:50:13.555747Z lvl=info msg="Cache snapshot (end)" log_id=0LLW8k2l000 engine=tsm1 trace_id=0LXQR1Ul000 op_name=tsm1_cache_snapshot op_event=end op_elapsed=2103.807ms
Mär 13 21:50:14 node1 influxd[21505]: fatal error: concurrent map read and map write
Mär 13 21:50:14 node1 influxd[21505]: goroutine 8000570258 [running]:
Mär 13 21:50:14 node1 influxd[21505]: runtime.throw(0x11078372, 0x21)
Mär 13 21:50:14 node1 influxd[21505]: /usr/local/go/src/runtime/panic.go:617 +0x5c fp=0xc0c111e8c8 sp=0xc0c111e888 pc=0x1003044c
Mär 13 21:50:14 node1 influxd[21505]: runtime.mapaccess1_faststr(0x10e26aa0, 0xc2f19e4300, 0xc2a3745280, 0x80, 0x1)
Mär 13 21:50:14 node1 influxd[21505]: /usr/local/go/src/runtime/map_faststr.go:21 +0x4e8 fp=0xc0c111e948 sp=0xc0c111e8c8 pc=0x10014128
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*partition).entry(0xc2025f7340, 0xc2a3745280, 0x80, 0x80, 0x0)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/ring.go:229 +0x7c fp=0xc0c111e998 sp=0xc0c111e948 pc=0x10cbbf2c
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*ring).entry(0xc2025f7160, 0xc2a3745280, 0x80, 0x80, 0x0)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/ring.go:93 +0x84 fp=0xc0c111e9e0 sp=0xc0c111e998 pc=0x10cbb3b4
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Cache).Values(0xc180513e40, 0xc2a3745280, 0x80, 0x80, 0x0, 0xc25e4dd320, 0x11ca5c00)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/cache.go:559 +0x88 fp=0xc0c111ead0 sp=0xc0c111e9e0 pc=0x10c5b3f8
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).buildFloatCursor(0xc0649f2280, 0x11cb25c0, 0xc23785c960, 0xc3837e4300, 0x13, 0xc0c3186800, 0x77, 0xc096941325, 0x5, 0x11ca8c80, ...)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.gen.go:18 +0x164 fp=0xc0c111ed70 sp=0xc0c111ead0 pc=0x10c7c6f4
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).buildCursor(0xc0649f2280, 0x11cb25c0, 0xc23785c960, 0xc3837e4300, 0x13, 0xc0c3186800, 0x77, 0xc2516f4000, 0x6, 0x6, ...)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2892 +0x968 fp=0xc0c111ef80 sp=0xc0c111ed70 pc=0x10c923e8
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createVarRefSeriesIterator(0xc0649f2280, 0x11cb25c0, 0xc23785c960, 0xc275ba1880, 0xc3837e4300, 0x13, 0xc0c3186800, 0x77, 0xc1374e6870, 0x0, ...)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2665 +0x2400 fp=0xc0c111fa38 sp=0xc0c111ef80 pc=0x10c91940
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createTagSetGroupIterators(0xc0649f2280, 0x11cb25c0, 0xc23785c960, 0xc275ba1880, 0xc3837e4300, 0x13, 0xc0cfb66700, 0x1, 0x10, 0xc1374e6870, ...)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2620 +0x17c fp=0xc0c111fc80 sp=0xc0c111fa38 pc=0x10c8f1fc
Mär 13 21:50:14 node1 influxd[21505]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createTagSetIterators.func1(0xc062e98a40, 0xc0649f2280, 0x11cb25c0, 0xc23785c960, 0xc275ba1880, 0xc3837e4300, 0x13, 0xc10949c000, 0x3c, 0x3c, ...)
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2580 +0x174 fp=0xc0c111ff58 sp=0xc0c111fc80 pc=0x10cd5b84
Mär 13 21:50:14 node1 influxd[21505]: runtime.goexit()
Mär 13 21:50:14 node1 influxd[21505]: /usr/local/go/src/runtime/asm_ppc64x.s:857 +0x4 fp=0xc0c111ff58 sp=0xc0c111ff58 pc=0x10061384
Mär 13 21:50:14 node1 influxd[21505]: created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createTagSetIterators
Mär 13 21:50:14 node1 influxd[21505]: /home/build/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2578 +0x34c
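For readers unfamiliar with this class of failure: the Go runtime throws `fatal error: concurrent map read and map write` when it detects a goroutine reading a built-in map while another goroutine is writing to it without synchronization. It is a fatal throw, not a panic, so `recover` cannot catch it. In the trace above the read side is `(*partition).entry` (ring.go:229) going through `runtime.mapaccess1_faststr`; the writing goroutine does not appear in the pasted output. The sketch below is only an illustration of the general pattern that avoids the error, a map guarded by a `sync.RWMutex`. The names `safeStore`, `get`, `set` and `hammer` are hypothetical and are not InfluxDB's code or its actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// safeStore is a hypothetical stand-in for a cache partition: a plain Go map
// guarded by a sync.RWMutex. It is NOT InfluxDB's actual type.
type safeStore struct {
	mu    sync.RWMutex
	store map[string]int
}

func newSafeStore() *safeStore {
	return &safeStore{store: make(map[string]int)}
}

// get takes the read lock, so any number of readers may run concurrently.
func (s *safeStore) get(key string) (int, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.store[key]
	return v, ok
}

// set takes the write lock, which excludes all readers and other writers.
func (s *safeStore) set(key string, v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.store[key] = v
}

// hammer runs one writer and four readers against the same store and
// returns the number of keys present once all goroutines have finished.
func hammer(s *safeStore, n int) int {
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			s.set(fmt.Sprintf("series-%d", i), i)
		}
	}()

	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < n; i++ {
				s.get(fmt.Sprintf("series-%d", i))
			}
		}()
	}

	wg.Wait()

	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.store)
}

func main() {
	s := newSafeStore()
	fmt.Println(hammer(s, 1000)) // prints 1000
}
```

If the `RLock`/`RUnlock` pair in `get` is removed, the program contains a data race and may abort with the same fatal error seen in the log. Running it with `go run -race` reports that kind of race deterministically, which can help when the crash itself only shows up about once a week.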
Steps to reproduce: List the minimal actions needed to reproduce the behavior.
Expected behavior: No crash
Actual behavior: InfluxDB crashes about once a week (but recovers automatically).
Environment info: There are two InfluxDB instances running on two identical ppc64le nodes. Collectd sends data from multiple hosts via the write_opentsdb plugin to both instances.
System: Linux 4.14.0-115.14.1.el7a.ppc64le ppc64le, RHEL 7.6 for ppc64le
InfluxDB v1.7.10 (git: heads/v1.7.10 f46f63d4e2d9684a2dd716594ab609ccd32f0a5b)
Build: 1.12.17 and: python build.py --package --release --clean --update
Config:
reporting-disabled = true

[data]
  index-version = "tsi1"
  query-log-enabled = false
  series-id-set-cache-size = 100

[http]
  auth-enabled = true
  log-enabled = false

[[opentsdb]]
  enabled = true
  bind-address = ":4242"
  database = "esmon_database"

[continuous_queries]
  log-enabled = false