influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Influxdb does not start any longer #21468

Open yareblo opened 3 years ago

yareblo commented 3 years ago

Maybe this is linked to #21467:

Version 2.0.6 on Ubuntu 18.04 (virtual environment, 6 CPUs, 16 GB RAM). ulimit -n is set to 65535 via /etc/security/limits.conf

Clean install as service, no special config file or parameters

Starting with "systemctl start influxdb"

Had the "too many open files" issue before (see #21467)

The service no longer starts and is stuck in a restart loop:

May 13 14:42:14 h2934423 influxd[1633]: ts=2021-05-13T12:42:14.957698Z lvl=info msg="Opened shard" log_id=0U5RlJ50000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb/engine/data/7761a8601e659405/autogen/34650 duration=38.257ms
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.422499Z lvl=info msg="Opened shard" log_id=0U5RlJ50000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/var/lib/influxdb/engine/data/7d4981dee9ebc742/autogen/42194 duration=588.498ms
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.704956Z lvl=info msg="Open store (end)" log_id=0U5RlJ50000 service=storage-engine service=store op_name=tsdb_open op_event=end op_elapsed=69525.295ms
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.706039Z lvl=info msg="Starting retention policy enforcement service" log_id=0U5RlJ50000 service=retention check_interval=30m
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.706333Z lvl=info msg="Starting precreation service" log_id=0U5RlJ50000 service=shard-precreation check_interval=10m advance_period=30m
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.707241Z lvl=info msg="Starting query controller" log_id=0U5RlJ50000 service=storage-reads concurrency_quota=1024 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=1024
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.732140Z lvl=info msg="Configuring InfluxQL statement executor (zeros indicate unlimited)." log_id=0U5RlJ50000 max_select_point=0 max_select_series=0 max_select_buckets=0
May 13 14:42:15 h2934423 influxd[1633]: ts=2021-05-13T12:42:15.768969Z lvl=error msg="Failed to start nats streaming server" log_id=0U5RlJ50000 error="nats: no servers available for connection"
May 13 14:42:15 h2934423 influxd[1633]: Error: nats: no servers available for connection
May 13 14:42:15 h2934423 influxd[1633]: See 'influxd -h' for help
May 13 14:42:16 h2934423 systemd[1]: influxdb.service: Main process exited, code=exited, status=1/FAILURE
May 13 14:42:16 h2934423 systemd[1]: influxdb.service: Failed with result 'exit-code'.
May 13 14:42:17 h2934423 systemd[1]: influxdb.service: Service hold-off time over, scheduling restart.
May 13 14:42:17 h2934423 systemd[1]: influxdb.service: Scheduled restart job, restart counter is at 15.

What can I do to get the service to start again?

d1ss0nanz commented 3 years ago

For us the issue was that /etc/hosts did not contain a proper entry for localhost.

My guess would be that the connection defaulted to 127.0.0.1, but our NATS was bound to ::1

After adding "127.0.0.1 localhost" to /etc/hosts, it started.
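
To check for the same condition before editing anything, a small script can report which addresses /etc/hosts maps to localhost. The following is only a sketch: the function names localhost_addresses and has_ipv4_loopback are made up for illustration, and it only inspects the hosts-file text. It does not confirm the IPv4/IPv6 mismatch guessed at above, nor how influxd or NATS actually resolve the name.

```python
import ipaddress


def localhost_addresses(hosts_text, name="localhost"):
    """Return the addresses mapped to `name` in hosts-file text, in file order."""
    addresses = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        # Hosts-file format: address first, then one or more host names.
        address, *names = entry.split()
        if name in names:
            addresses.append(address)
    return addresses


def has_ipv4_loopback(hosts_text, name="localhost"):
    """True if `name` maps to at least one IPv4 loopback address (127.0.0.0/8)."""
    for address in localhost_addresses(hosts_text, name):
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            # Skip entries that are not plain IP addresses.
            continue
        if ip.version == 4 and ip.is_loopback:
            return True
    return False
```

Calling has_ipv4_loopback(open("/etc/hosts").read()) on the affected machine would return False if localhost is only mapped to ::1 or is missing entirely, which is the situation the "127.0.0.1 localhost" line fixes.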