Open tomwimmenhove opened 2 years ago
Hey I gave this a shot and wasn't able to reproduce it locally. I tried CTRL+C'ing a curl request as well as various levels of kill, and the influxdb node stayed up the whole time. If you can provide any more information to make reproducing easier, that would be great!
I'm using a client written in C# to communicate with influx. The way a query is cancelled is by passing a CancellationToken to HttpClient.SendAsync(). I'm not sure if this is equivalent to sending a SIGINT to curl (although it seems like it should be?)
The database itself is over one TiB, and the moment at which I cancel the query definitely has an effect on the outcome. Maybe, because the database is so large, the chance that I cancel the query during some critical part of the process is higher.
I was hoping that the stack trace would give you some information...
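For anyone trying to reproduce this without the C# client, the cancellation described above can be roughly approximated by starting a query over HTTP and dropping the connection before the response has been fully read. The following is only a sketch under assumptions: the function name cancel_mid_response is made up for illustration, it handles plain http:// URLs only, and closing the socket early is an approximation of what HttpClient does when its CancellationToken fires, not a confirmed equivalent.

```python
import http.client
from urllib.parse import urlsplit


def cancel_mid_response(url, body, headers, read_bytes=1):
    """POST `body` to `url`, read at most `read_bytes` of the response,
    then drop the connection without reading the rest.

    Hypothetical helper for reproduction attempts; plain HTTP only.
    Returns (status_code, number_of_bytes_actually_read).
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=30)
    try:
        # Rebuild the request target (path plus optional query string).
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("POST", target, body=body, headers=headers)
        resp = conn.getresponse()
        # Read only the first few bytes so the server is still streaming
        # results when the connection goes away.
        received = resp.read(read_bytes)
        return resp.status, len(received)
    finally:
        # Closing here, with unread data still in flight, is the "cancel".
        conn.close()
```

Against an InfluxDB 2.x node this would be pointed at the /api/v2/query endpoint on the configured bind address (":8086" in the config below), with an "Authorization: Token ..." header, "Content-Type: application/vnd.flux", and a Flux query as the body. The org name, token, and query are placeholders that depend on the setup. Varying read_bytes, or adding a delay before the close, changes the moment at which the query is abandoned.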
Good to know about the DB being large. I also noticed your path is /mnt/NFS/InfluxData/engine. What type of NFS mount are you using?
I'm not entirely sure what you mean by 'type'. The options in /etc/fstab are "defaults,rsize=65536,wsize=65536,nosuid", and it's connected over 10 Gbit to a NAS of a few tens of TB consisting of only SSDs.
Steps to reproduce: List the minimal actions needed to reproduce the behavior.
Expected behavior: The query to simply be aborted
Actual behavior: Internal errors/panics and even a SegFault.
Environment info:
Logs: A logfile from the time that the service actually crashed completely (process exited) has been attached: syslog.ifl.gz. It can't be pasted, since the logfile is multiple megabytes. Here's the 'head' of the log file:
Config: { "assets-path": "", "bolt-path": "/root/.influxdbv2/influxd.bolt", "e2e-testing": false, "engine-path": "/mnt/NFS/InfluxData/engine", "feature-flags": null, "flux-log-enabled": false, "hardening-enabled": false, "http-bind-address": ":8086", "http-idle-timeout": 180000000000, "http-read-header-timeout": 10000000000, "http-read-timeout": 0, "http-write-timeout": 0, "influxql-max-select-buckets": 0, "influxql-max-select-point": 0, "influxql-max-select-series": 0, "log-level": "info", "metrics-disabled": false, "nats-max-payload-bytes": 0, "nats-port": 0, "no-tasks": false, "pprof-disabled": false, "query-concurrency": 1024, "query-initial-memory-bytes": 0, "query-max-memory-bytes": 2147483648, "query-memory-bytes": 0, "query-queue-size": 1024, "reporting-disabled": false, "secret-store": "bolt", "session-length": 60, "session-renew-disabled": false, "sqlite-path": "/root/.influxdbv2/influxd.sqlite", "storage-cache-max-memory-size": 1073741824, "storage-cache-snapshot-memory-size": 26214400, "storage-cache-snapshot-write-cold-duration": "10m0s", "storage-compact-full-write-cold-duration": "4h0m0s", "storage-compact-throughput-burst": 50331648, "storage-max-concurrent-compactions": 0, "storage-max-index-log-file-size": 1048576, "storage-no-validate-field-size": false, "storage-retention-check-interval": "30m0s", "storage-series-file-max-concurrent-snapshot-compactions": 0, "storage-series-id-set-cache-size": 0, "storage-shard-precreator-advance-period": "30m0s", "storage-shard-precreator-check-interval": "10m0s", "storage-tsm-use-madv-willneed": false, "storage-validate-keys": false, "storage-wal-fsync-delay": "0s", "storage-wal-max-concurrent-writes": 0, "storage-wal-max-write-delay": 600000000000, "storage-write-timeout": 10000000000, "store": "disk", "testing-always-allow-setup": false, "tls-cert": "", "tls-key": "", "tls-min-version": "1.2", "tls-strict-ciphers": false, "tracing-type": "", "ui-disabled": false, "vault-addr": "", "vault-cacert": "", 
"vault-capath": "", "vault-client-cert": "", "vault-client-key": "", "vault-client-timeout": 0, "vault-max-retries": 0, "vault-skip-verify": false, "vault-tls-server-name": "", "vault-token": "" }