influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

fatal error: concurrent map read and map write running rhel 8 on powerpc #21462

Open cd-gumo opened 3 years ago

cd-gumo commented 3 years ago

A customer uses InfluxDB on two separate PowerPC machines running RHEL 8.

In the logs we found fatal "concurrent map read and map write" errors that crash the service. Sometimes these crashes leave InfluxDB in a corrupted state in which it cannot start again.

This problem has existed since the hosts were first set up. They were running InfluxDB 1.7.10 when we first looked at the logs. An upgrade to 1.8.4 didn't solve the issue.

Environment info:

Logs:

fatal error: concurrent map read and map write

goroutine 18708922 [running]:
runtime.throw(0x11078372, 0x21)
    /usr/local/go/src/runtime/panic.go:617 +0x5c fp=0xc484513600 sp=0xc4845135c0 pc=0x1003044c
runtime.mapaccess1_faststr(0x10e26aa0, 0xc069cf70e0, 0xc4d3b0c480, 0xba, 0x10cbb420)
    /usr/local/go/src/runtime/map_faststr.go:21 +0x4e8 fp=0xc484513680 sp=0xc484513600 pc=0x10014128
github.com/influxdata/influxdb/tsdb/engine/tsm1.(partition).write(0xc127936380, 0xc4d3b0c480, 0xba, 0xc0, 0xc48b0ee0d0, 0x1, 0x1, 0x0, 0x0, 0x0)
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/ring.go:239 +0x88 fp=0xc4845136e0 sp=0xc484513680 pc=0x10cbc008
github.com/influxdata/influxdb/tsdb/engine/tsm1.(ring).write(0xc08590d860, 0xc4d3b0c480, 0xba, 0xc0, 0xc48b0ee0d0, 0x1, 0x1, 0x101, 0x0, 0x0)
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/ring.go:100 +0x9c fp=0xc484513750 sp=0xc4845136e0 pc=0x10cbb47c
github.com/influxdata/influxdb/tsdb/engine/tsm1.(Cache).WriteMulti(0xc3845ee580, 0xc0ece3dd10, 0xc24a194780, 0xa4)
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/cache.go:343 +0x290 fp=0xc4845138e0 sp=0xc484513750 pc=0x10c5a330
github.com/influxdata/influxdb/tsdb/engine/tsm1.(Engine).WritePoints(0xc24a194780, 0xc48209c000, 0xf7, 0xf7, 0x0, 0x0)
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1354 +0xbb8 fp=0xc484513ab8 sp=0xc4845138e0 pc=0x10c86078
github.com/influxdata/influxdb/tsdb.(Shard).WritePoints(0xc04bf34d80, 0xc48209c000, 0xf7, 0xf7, 0x0, 0x0)
    /root/go/src/github.com/influxdata/influxdb/tsdb/shard.go:525 +0x1d8 fp=0xc484513b78 sp=0xc484513ab8 pc=0x105a6b18
github.com/influxdata/influxdb/tsdb.(Store).WriteToShard(0xc000192200, 0x593, 0xc48209c000, 0xf7, 0xf7, 0x0, 0x0)
    /root/go/src/github.com/influxdata/influxdb/tsdb/store.go:1410 +0x1f0 fp=0xc484513c00 sp=0xc484513b78 pc=0x105ba0a0
github.com/influxdata/influxdb/coordinator.(PointsWriter).writeToShard(0xc00000cc00, 0xc4820a3820, 0xc43f91866f, 0x10, 0xc0004e8978, 0x6, 0xc48209c000, 0xf7, 0xf7, 0xe, ...)
    /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:370 +0x88 fp=0xc484513e98 sp=0xc484513c00 pc=0x1065dc38
github.com/influxdata/influxdb/coordinator.(PointsWriter).WritePointsPrivileged.func1(0xc00000cc00, 0xc4c993ede0, 0xc4820a3820, 0xc43f91866f, 0x10, 0xc0004e8978, 0x6, 0xc48209c000, 0xf7, 0xf7)
    /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:312 +0x70 fp=0xc484513f70 sp=0xc484513e98 pc=0x1066da20
runtime.goexit()
    /usr/local/go/src/runtime/asm_ppc64x.s:857 +0x4 fp=0xc484513f70 sp=0xc484513f70 pc=0x10061384
created by github.com/influxdata/influxdb/coordinator.(PointsWriter).WritePointsPrivileged
    /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:311 +0x240

goroutine 1 [chan receive, 195 minutes]:
main.(Main).Run(0xc00065df38, 0xc00003c190, 0x2, 0x2, 0x12c74720, 0x1c464371)
    /root/go/src/github.com/influxdata/influxdb/cmd/influxd/main.go:90 +0x29c
main.main()
    /root/go/src/github.com/influxdata/influxdb/cmd/influxd/main.go:45 +0x154

goroutine 5 [syscall, 196 minutes]:
os/signal.signal_recv(0x0)
    /usr/local/go/src/runtime/sigqueue.go:139 +0xf8
os/signal.loop()
    /usr/local/go/src/os/signal/signal_unix.go:23 +0x24
created by os/signal.init.0
    /usr/local/go/src/os/signal/signal_unix.go:29 +0x3c

goroutine 21 [select]:
github.com/influxdata/influxdb/vendor/go.opencensus.io/stats/view.(worker).start(0xc0001dcb90)
    /root/go/src/github.com/influxdata/influxdb/vendor/go.opencensus.io/stats/view/worker.go:154 +0xd8
created by github.com/influxdata/influxdb/vendor/go.opencensus.io/stats/view.init.0
    /root/go/src/github.com/influxdata/influxdb/vendor/go.opencensus.io/stats/view/worker.go:32 +0x64

goroutine 73 [IO wait, 196 minutes]:
internal/poll.runtime_pollWait(0x7fff9c198fd0, 0x72, 0x0)
    /usr/local/go/src/runtime/netpoll.go:182 +0x54
internal/poll.(pollDesc).wait(0xc00000cc98, 0x72, 0x0, 0x0, 0x1104cc09)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0xac
internal/poll.(pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(FD).Accept(0xc00000cc80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/internal/poll/fd_unix.go:384 +0x1c4
net.(netFD).accept(0xc00000cc80, 0x1007e224, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:238 +0x34
net.(TCPListener).accept(0xc0001ad750, 0x0, 0xc0004dcb40, 0x0)
    /usr/local/go/src/net/tcpsock_posix.go:139 +0x30
net.(TCPListener).Accept(0xc0001ad750, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/tcpsock.go:260 +0x50
github.com/influxdata/influxdb/tcp.(Mux).Serve(0xc0004dcb40, 0x11cab260, 0xc0001ad750, 0x11055e23, 0xe)
    /root/go/src/github.com/influxdata/influxdb/tcp/mux.go:75 +0x78
created by github.com/influxdata/influxdb/cmd/influxd/run.(Server).Open
    /root/go/src/github.com/influxdata/influxdb/cmd/influxd/run/server.go:387 +0x2ac

goroutine 13379 [select]:
github.com/influxdata/influxdb/tsdb/engine/tsm1.(Engine).compactCache(0xc24a194000)
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1926 +0x114
github.com/influxdata/influxdb/tsdb/engine/tsm1.(Engine).enableSnapshotCompactions.func1(0xc3789122f0, 0xc24a194000)
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:505 +0x54
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(Engine).enableSnapshotCompactions
    /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:505 +0x13c

goroutine 13634 [select, 35 minutes]:
github.com/influxdata/influxdb/services/subscriber.(Service).waitForMetaUpdates(0xc0002d8900)
    /root/go/src/github.com/influxdata/influxdb/services/subscriber/service.go:165 +0xb4
github.com/influxdata/influxdb/services/subscriber.(Service).Open.func2(0xc0002d8900)
    /root/go/src/github.com/influxdata/influxdb/services/subscriber/service.go:102 +0x5c
created by github.com/influxdata/influxdb/services/subscriber.(Service).Open
    /root/go/src/github.com/influxdata/influxdb/services/subscriber/service.go:100 +0x178

goroutine 18705268 [IO wait]:
internal/poll.runtime_pollWait(0x7ffebe28c828, 0x72, 0xffffffffffffffff)
    /usr/local/go/src/runtime/netpoll.go:182 +0x54
internal/poll.(pollDesc).wait(0xc5ae244118, 0x72, 0x0, 0x1, 0xffffffffffffffff)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0xac
internal/poll.(pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(FD).Read(0xc5ae244100, 0xc06e700341, 0x1, 0x1, 0x0, 0x0, 0x0)
    /usr/local/go/src/internal/poll/fd_unix.go:169 +0x194
net.(netFD).Read(0xc5ae244100, 0xc06e700341, 0x1, 0x1, 0x1f, 0x1f, 0xc577dc5f00)
    /usr/local/go/src/net/fd_unix.go:202 +0x48
net.(conn).Read(0xc5d5958010, 0xc06e700341, 0x1, 0x1, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:177 +0x6c
net/http.(connReader).backgroundRead(0xc06e700330)
    /usr/local/go/src/net/http/server.go:677 +0x5c
created by net/http.(connReader).startBackgroundRead
    /usr/local/go/src/net/http/server.go:673 +0xdc

goroutine 11650319 [select]:
...
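
For readers unfamiliar with this error: the Go runtime aborts the process with "fatal error: concurrent map read and map write" when it detects a map being read while another goroutine is writing to it. Unlike an ordinary panic, this cannot be caught with recover. The sketch below is not InfluxDB's code and makes no claim about the root cause of this issue; `partition`, `write`, and `count` are hypothetical names chosen to echo the frames in the trace. It only illustrates the usual pattern of guarding a shared map with a `sync.RWMutex` so that reads and writes never overlap.

```go
package main

import (
	"fmt"
	"sync"
)

// partition is a hypothetical stand-in for one shard of an in-memory cache:
// a plain Go map protected by a read/write mutex.
type partition struct {
	mu    sync.RWMutex
	store map[string][]float64
}

func newPartition() *partition {
	return &partition{store: make(map[string][]float64)}
}

// write appends values under key. It takes the exclusive lock because
// appending to a map entry modifies the map.
func (p *partition) write(key string, values []float64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.store[key] = append(p.store[key], values...)
}

// count returns how many values are stored under key. A shared (read) lock
// is enough here, so many readers can run in parallel with each other.
func (p *partition) count(key string) int {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return len(p.store[key])
}

func main() {
	p := newPartition()

	// Eight goroutines write and read concurrently. Without the mutex,
	// this access pattern is the kind the runtime reports as a fatal error.
	var wg sync.WaitGroup
	for w := 0; w < 8; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				p.write(fmt.Sprintf("series-%d", i%16), []float64{float64(id)})
				_ = p.count("series-0")
			}
		}(w)
	}
	wg.Wait()

	total := 0
	for i := 0; i < 16; i++ {
		total += p.count(fmt.Sprintf("series-%d", i))
	}
	fmt.Println(total) // 8 goroutines x 1000 writes of one value each: 8000
}
```

Running a program like this with `go run -race` is one way to check that every map access is covered by the lock.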

eist76 commented 3 years ago

Hello,

I see the same fatal error and InfluxDB crash (every 3-4 days) running InfluxDB 1.8.4 on AIX 7.2 (package source: https://www.power-devops.com/influxdb):

fatal error: concurrent map read and map write

goroutine 3703670186 [running]:
runtime.throw(0x100e16919, 0x21)
    /opt/freeware/lib/golang/src/runtime/panic.go:1116 +0x68 fp=0xa0001002c8f3488 sp=0xa0001002c8f3448 pc=0x10003a008
runtime.mapaccess1_faststr(0x11014cff0, 0xa00010081b0cf60, 0xa0001004bd66690, 0x6d, 0xa0001000161eb90)
...

cd-gumo commented 3 years ago

The service has now ended up in a corrupted state and cannot start again. This already happened with 1.7.10, and now also with 1.8.4. Usually a reboot resolves the problem temporarily.

influxd[2696]: ts=2021-05-17T05:01:02.019235Z lvl=info msg="Opened shard" log_id=0UAAxGwW000 service=store trace_id=0UAAxGz0000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/786 duration=15047.791ms
influxd[2696]: ts=2021-05-17T05:01:02.050989Z lvl=info msg="index opened with 8 partitions" log_id=0UAAxGwW000 index=tsi
influxd[2696]: ts=2021-05-17T05:01:02.167406Z lvl=info msg="Opened shard" log_id=0UAAxGwW000 service=store trace_id=0UAAxGz0000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1451 duration=15900.304ms
influxd[2696]: ts=2021-05-17T05:01:02.215958Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1715/000001032-000000003.tsm id=1 duration=2546.934ms
influxd[2696]: ts=2021-05-17T05:01:02.219511Z lvl=info msg="Opened shard" log_id=0UAAxGwW000 service=store trace_id=0UAAxGz0000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/792 duration=15041.146ms
influxd[2696]: ts=2021-05-17T05:01:02.226840Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1806/000001070-000000003.tsm id=1 duration=2453.310ms
influxd[2696]: ts=2021-05-17T05:01:02.252111Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1712/000000826-000000003.tsm id=1 duration=2431.974ms
influxd[2696]: ts=2021-05-17T05:01:02.281465Z lvl=info msg="index opened with 8 partitions" log_id=0UAAxGwW000 index=tsi
influxd[2696]: ts=2021-05-17T05:01:02.429691Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1797/000001038-000000002.tsm id=0 duration=5552.805ms
influxd[2696]: ts=2021-05-17T05:01:02.429824Z lvl=info msg="Opened shard" log_id=0UAAxGwW000 service=store trace_id=0UAAxGz0000 op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1797 duration=7100.695ms
influxd[2696]: ts=2021-05-17T05:01:02.630800Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1502/000001078-000000003.tsm id=1 duration=2532.643ms
influxd[2696]: ts=2021-05-17T05:01:02.828240Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/980/000000963-000000003.tsm id=1 duration=2413.143ms
influxd[2696]: ts=2021-05-17T05:01:02.886602Z lvl=info msg="Opened file" log_id=0UAAxGwW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1800/000001095-000000002.tsm id=0 duration=5703.898ms
influxd[2696]: ts=2021-05-17T05:01:02.886921Z lvl=info msg="Opened shard" log_id=0UAAxGwW000 service=store trace_id=0UAAxGz0000 op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb/data/nimon_v56_aix_db/r_14_d/1800 duration=7278.544ms
influxd[2696]: unexpected fault address 0x7f357ad70000
influxd[2696]: fatal error: fault
influxd[2696]: [signal SIGBUS: bus error code=0x2 addr=0x7f357ad70000 pc=0x10c919b0]
influxd[2696]: goroutine 202527 [running]:
influxd[2696]: runtime.throw(0x110cd6b9, 0x5)
influxd[2696]: /usr/local/go/src/runtime/panic.go:1116 +0x5c fp=0xc09cd79a68 sp=0xc09cd79a28 pc=0x100366dc
influxd[2696]: runtime.sigpanic()
influxd[2696]: /usr/local/go/src/runtime/signal_unix.go:692 +0x450 fp=0xc09cd79aa8 sp=0xc09cd79a68 pc=0x1004f270
influxd[2696]: github.com/influxdata/influxdb/tsdb/index/tsi1.(LogEntry).UnmarshalBinary(0xc09cd79bf8, 0x7f357ad70000, 0x1a8bc, 0x1a8bc, 0x1a8bc, 0x1a8bc)
influxd[2696]: /root/go/src/github.com/influxdata/influxdb/tsdb/index/tsi1/log_file.go:1128 +0x30 fp=0xc09cd79b38 sp=0xc09cd79ac8 pc=0x10c919b0
influxd[2696]: github.com/influxdata/influxdb/tsdb/index/tsi1.(LogFile).open(0xc002816b40, 0x4b, 0xc0cc99b320)
influxd[2696]: /root/go/src/github.com/influxdata/influxdb/tsdb/index/tsi1/log_file.go:154 +0x2f0 fp=0xc09cd79ca0 sp=0xc09cd79b38 pc=0x10c8abe0
influxd[2696]: github.com/influxdata/influxdb/tsdb/index/tsi1.(LogFile).Open(0xc002816b40, 0xc0cc9ce9b0, 0x4b)
influxd[2696]: /root/go/src/github.com/influxdata/influxdb/tsdb/index/tsi1/log_file.go:110 +0x2c fp=0xc09cd79ce8 sp=0xc09cd79ca0 pc=0x10c8a88c
influxd[2696]: github.com/influxdata/influxdb/tsdb/index/tsi1.(Partition).openLogFile(0xc0f1584b00, 0xc0cc9ce9b0, 0x4b, 0xc0cc9ce9b0, 0x4b, 0x0)
influxd[2696]: /root/go/src/github.com/influxdata/influxdb/tsdb/index/tsi1/partition.go:255 +0x64 fp=0xc09cd79d30 sp=0xc09cd79ce8 pc=0x10c97ba4
influxd[2696]: github.com/influxdata/influxdb/tsdb/index/tsi1.(Partition).Open(0xc0f1584b00, 0x0, 0x0)
influxd[2696]: /root/go/src/github.com/influxdata/influxdb/tsdb/index/tsi1/partition.go:194 +0x648 fp=0xc09cd79f50 sp=0xc09cd79d30 pc=0x10c975b8
influxd[2696]: github.com/influxdata/influxdb/tsdb/index/tsi1.(Index).Open.func1(0xc10417e348, 0x8, 0xc000f2a690, 0xc026a22cc0, 0x2)
influxd[2696]: /root/go/src/github.com/influxdata/influxdb/tsdb/index/tsi1/index.go:285 +0x40 fp=0xc09cd79f98 sp=0xc09cd79f50 pc=0x10ca5570
influxd[2696]: runtime.goexit()
...

Official support for ppc64le mentioned in #17228 would be great.

aklyachkin commented 3 years ago

@v0rce did you try to run it on RHEL7/ppc64le?

cd-gumo commented 3 years ago

@aklyachkin unfortunately not. These are new hosts. RHEL 7 is not an option due to the remaining support period.

aklyachkin commented 3 years ago

@v0rce can you check whether it still fails with 1.8.6?

cd-gumo commented 3 years ago

Yes, unfortunately that didn't help. Logs from crash here:

influxd[15246]: fatal error: concurrent map read and map write
influxd[15246]: goroutine 1537385 [running]:
influxd[15246]: runtime.throw(0x111095f2, 0x21)
influxd[15246]: /usr/local/go/src/runtime/panic.go:1116 +0x5c fp=0xc0b9799490 sp=0xc0b9799450 pc=0x10036dbc
influxd[15246]: runtime.mapaccess1_faststr(0x10eaec60, 0xc0b8be4c60, 0xc0e08ae240, 0xb6, 0xc0e1a2f508)
influxd[15246]: /usr/local/go/src/runtime/map_faststr.go:21 +0x480 fp=0xc0b9799510 sp=0xc0b9799490 pc=0x10014390
influxd[15246]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(partition).write(0xc096e76f60, 0xc0e08ae240, 0xb6, 0xc0, 0xc0d357eba0, 0x1, 0x1, 0x0, 0x0, 0x0)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/ring.go:239 +0x94 fp=0xc0b9799590 sp=0xc0b9799510 pc=0x10d3d424
influxd[15246]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(ring).write(0xc096e76e60, 0xc0e08ae240, 0xb6, 0xc0, 0xc0d357eba0, 0x1, 0x1, 0x1, 0x0, 0x0)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/ring.go:100 +0xa0 fp=0xc0b9799600 sp=0xc0b9799590 pc=0x10d3c7e0
influxd[15246]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(Cache).WriteMulti(0xc06980ba20, 0xc0aef8c5d0, 0xc0d44fc370, 0xaa)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/cache.go:343 +0x2a0 fp=0xc0b9799780 sp=0xc0b9799600 pc=0x10cdca50
influxd[15246]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(Engine).WritePointsWithContext(0xc0c01e1400, 0x11f6c820, 0xc0866df320, 0xc0c9833000, 0x6fe, 0x6fe, 0x0, 0x0)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1408 +0xb58 fp=0xc0b9799968 sp=0xc0b9799780 pc=0x10d07dd8
influxd[15246]: github.com/influxdata/influxdb/tsdb.(Shard).WritePointsWithContext(0xc0e0037440, 0x11f6c820, 0xc0866df320, 0xc0c9833000, 0x6fe, 0x6fe, 0x0, 0x0)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/tsdb/shard.go:553 +0x478 fp=0xc0b9799a30 sp=0xc0b9799968 pc=0x105b9fc8
influxd[15246]: github.com/influxdata/influxdb/tsdb.(Store).WriteToShardWithContext(0xc000611a00, 0x11f6c820, 0xc0866df320, 0xb7c, 0xc0c9833000, 0x6fe, 0x6fe, 0x0, 0x0)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/tsdb/store.go:1453 +0x210 fp=0xc0b9799ae8 sp=0xc0b9799a30 pc=0x105cd690
influxd[15246]: github.com/influxdata/influxdb/coordinator.(PointsWriter).writeToShardWithContext.func1(0x45574f503d657275, 0x7473756c632c3952)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:420 +0x154 fp=0xc0b9799b80 sp=0xc0b9799ae8 pc=0x10694124
influxd[15246]: github.com/influxdata/influxdb/coordinator.(PointsWriter).writeToShardWithContext(0xc00000c980, 0x11f6c820, 0xc0866df320, 0xc0ca726de0, 0xc0b48e62af, 0x10, 0xc00053c8f0, 0x6, 0xc0c9833000, 0x6fe, ...)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:432 +0xbc fp=0xc0b9799e68 sp=0xc0b9799b80 pc=0x1068519c
influxd[15246]: github.com/influxdata/influxdb/coordinator.(PointsWriter).WritePointsPrivilegedWithContext.func1(0xc00000c980, 0xc0bfe2c540, 0x11f6c820, 0xc0866df320, 0xc0ca726de0, 0xc0b48e62af, 0x10, 0xc00053c8f0, 0x6, 0xc0c9833000, ...)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:341 +0x138 fp=0xc0b9799f60 sp=0xc0b9799e68 pc=0x10693dc8
influxd[15246]: runtime.goexit()
influxd[15246]: /usr/local/go/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xc0b9799f60 sp=0xc0b9799f60 pc=0x1006de94
influxd[15246]: created by github.com/influxdata/influxdb/coordinator.(PointsWriter).WritePointsPrivilegedWithContext
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/coordinator/points_writer.go:336 +0x250
influxd[15246]: goroutine 1 [chan receive, 10 minutes]:
influxd[15246]: main.(*Main).Run(0xc00083ff20, 0xc00003c190, 0x2, 0x2, 0x0, 0x32)
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/cmd/influxd/main.go:90 +0x240
influxd[15246]: main.main()
influxd[15246]: /root/go/src/github.com/influxdata/influxdb/cmd/influxd/main.go:45 +0x140
...

aklyachkin commented 3 years ago

@v0rce how big is your database?

cd-gumo commented 3 years ago

The InfluxDB size is 1.5 TB with a retention time of 14 days, collecting data via nimon for nearly 300 hosts.

aklyachkin commented 3 years ago

It is much bigger than I expected. I can't build a database like this in my sandbox environment. The maximum I've reached with synthetic tools is 5 GB, and up to that size it works without any problems.

cd-gumo commented 3 years ago

The customer upgraded to InfluxDB 1.9.2, but the problem persists.