influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

telegraf 1.21.3 crashing: &GoSNMP.Conn is missing #10554

Closed jostrasser closed 2 years ago

jostrasser commented 2 years ago

Relevant telegraf.conf

# Telegraf Configuration
#
# Telegraf is entirely plugin driven. All metrics are gathered from the
# declared inputs, and sent to the declared outputs.
#
# Plugins must be declared in here to be active.
# To deactivate a plugin, comment out the name and any variables.
#
# Use 'telegraf -config telegraf.conf -test' to see what metrics a config
# file would generate.
#
# Environment variables can be used anywhere in this config file, simply prepend
# them with $. For strings the variable must be within quotes (ie, "$STR_VAR"),
# for numbers and booleans they should be plain (ie, $INT_VAR, $BOOL_VAR)

# Global tags can be specified here in key="value" format.
[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "60s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## For failed writes, telegraf will cache metric_buffer_limit metrics for each
  ## output, and will flush this buffer on a successful write. Oldest metrics
  ## are dropped first when this buffer fills.
  ## This buffer only fills when writes fail to output plugin(s).
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. You shouldn't set this below
  ## interval. Maximum flush_interval will be flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Logging configuration:
  ## Run telegraf with debug log messages.
  debug = false
  ## Run telegraf in quiet mode (error log messages only).
  quiet = false
  ## Specify the log file name. The empty string means to log to stderr.
  logfile = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false

###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple urls can be specified as part of the same cluster,
  ## this means that only ONE of the urls will be written to each interval.
  # urls = ["udp://127.0.0.1:8089"] # UDP endpoint example
  urls = ["http://localhost:8086"] # required
  ## The target database for metrics (telegraf will create it if not exists).
  database = "snmpdb" # required

###############################################################################
#                                INPUT                                        #
###############################################################################

# # Retrieves SNMP values from remote agents
[[inputs.snmp]]
  agents = [ "192.168.5.10:161" ]
#   ## Timeout for each SNMP query.
  timeout = "30s"
#   ## Interval for each SNMP query.
  interval = "15s"
#   ## Number of retries to attempt within timeout.
  retries = 3
#   ## SNMP version, values can be 1, 2, or 3
  version = 2
#
#   ## SNMP community string.
  community = "public"
#
#   ## The GETBULK max-repetitions parameter
  max_repetitions = 10
#
#   ## SNMPv3 auth parameters
#   #sec_name = "myuser"
#   #auth_protocol = "md5"      # Values: "MD5", "SHA", ""
#   #auth_password = "pass"
#   #sec_level = "authNoPriv"   # Values: "noAuthNoPriv", "authNoPriv", "authPriv"
#   #context_name = ""
#   #priv_protocol = ""         # Values: "DES", "AES", ""
#   #priv_password = ""
#
#   ## measurement name
#   name = "QNAP"
#[[inputs.snmp.table]]
#name = "remote_servers"

# QNAP System
[[inputs.snmp.field]]
  name = "CPU_Load"
  oid = "1.3.6.1.4.1.24681.1.3.1.0"
#
[[inputs.snmp.field]]
  name = "CPU1"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196608"
#
[[inputs.snmp.field]]
  name = "CPU2"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196609"
#
[[inputs.snmp.field]]
  name = "CPU3"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196610"
#
[[inputs.snmp.field]]
  name = "CPU4"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196611"
#
[[inputs.snmp.field]]
  name = "CPU5"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196612"
#
[[inputs.snmp.field]]
  name = "CPU6"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196613"
#
[[inputs.snmp.field]]
  name = "CPU7"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196614"
#
[[inputs.snmp.field]]
  name = "CPU8"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196615"
#
[[inputs.snmp.field]]
  name = "Total_RAM"
  oid = "1.3.6.1.4.1.24681.1.3.2.0"
#
[[inputs.snmp.field]]
  name = "Free_RAM"
  oid = "1.3.6.1.4.1.24681.1.3.3.0"
#
[[inputs.snmp.field]]
  name = "Fan1_Speed"
  oid = "1.3.6.1.4.1.24681.1.3.15.1.3.1"
#
[[inputs.snmp.field]]
  name = "Fan2_Speed"
  oid = "1.3.6.1.4.1.24681.1.3.15.1.3.2"
#
[[inputs.snmp.field]]
  name = "CPU_Temp"
  oid = "1.3.6.1.4.1.24681.1.4.1.1.1.1.4.2.0"
#
[[inputs.snmp.field]]
  name = "System_Temp"
  oid = "1.3.6.1.4.1.24681.1.3.6.0"

# Input Hostnames
[[inputs.snmp.field]]
    name = "hostname"
    oid = "1.3.6.1.2.1.1.5.0"
    is_tag = true

# Networking All Interfaces QNAP + SWITCHES
[[inputs.snmp.table]]
    #ifTable,1.3.6.1.2.1.2.2,IF-MIB,OBJECT
    name = "ifTable"
    inherit_tags = [ "hostname" ]
#
#
    [[inputs.snmp.table.field]]
    #ifDescr,1.3.6.1.2.1.2.2.1.2,IF-MIB,OBJECT
      name = "Interface"
      oid = "1.3.6.1.2.1.2.2.1.2"
      is_tag = true
#
    [[inputs.snmp.table.field]]
    #ifInOctets,1.3.6.1.2.1.2.2.1.10,IF-MIB,OBJECT
      name = "RXBytes"
      oid = "1.3.6.1.2.1.2.2.1.10"
#
    [[inputs.snmp.table.field]]
    #ifOutOctets,1.3.6.1.2.1.2.2.1.16,IF-MIB,OBJECT
#
      name = "TXBytes"
      oid = "1.3.6.1.2.1.2.2.1.16"
#
# End Networking Interfaces

# QNAP DISK Section
[[inputs.snmp]]
  agents = [ "192.168.5.10:161"]
#   ## Timeout for each SNMP query.
  timeout = "30s"
  interval = "10m"
#   ## Number of retries to attempt within timeout.
  retries = 3
#   ## SNMP version, values can be 1, 2, or 3
  version = 2
#
#   ## SNMP community string.
  community = "public"
#
#   ## The GETBULK max-repetitions parameter
  max_repetitions = 10

# QNAP HDD Table
  [[inputs.snmp.table]]
    #systemHdTableEX,1.3.6.1.4.1.24681.1.3.11,NAS-MIB,OBJECT-TYPE
    name = "HDDTable"
    inherit_tags = [ "hostname" ]
#
    [[inputs.snmp.table.field]]
    #hdDescrEX,1.3.6.1.4.1.24681.1.3.11.1.2,NAS-MIB,OBJECT-TYPE
#
      name = "HDDDescription"
      oid = "1.3.6.1.4.1.24681.1.3.11.1.2"
      is_tag = true
#
    [[inputs.snmp.table.field]]
    #hdTemperatureEX,1.3.6.1.4.1.24681.1.3.11.1.3,NAS-MIB,OBJECT-TYPE
      name = "Temperature"
      oid = "1.3.6.1.4.1.24681.1.3.11.1.3"
#
    [[inputs.snmp.table.field]]
    #hdStatusEX,1.3.6.1.4.1.24681.1.3.11.1.4,NAS-MIB,OBJECT-TYPE
      name = "Status"
      oid = "1.3.6.1.4.1.24681.1.3.11.1.4"
#
    [[inputs.snmp.table.field]]
    #hdSmartInfoEX,1.3.6.1.4.1.24681.1.3.11.1.7,NAS-MIB,OBJECT-TYPE
      name = "S.M.A.R.T."
      oid = "1.3.6.1.4.1.24681.1.3.11.1.7"

# QNAP DISKPOOL Table
[[inputs.snmp.table]]
    #poolTable,1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2,NAS-MIB,OBJECT-TYPE
    name = "poolTable"
    inherit_tags = [ "hostname" ]
#
    [[inputs.snmp.table.field]]
    #poolID,1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.2,NAS-MIB,OBJECT-TYPE
      name = "poolID"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.2"
      is_tag = true
#
    [[inputs.snmp.table.field]]
    #poolCapacity,1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.3,NAS-MIB,OBJECT-TYPE
      name = "poolCapacity"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.3"
#
    [[inputs.snmp.table.field]]
    #poolFreeSize,1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.4,NAS-MIB,OBJECT-TYPE
      name = "poolFreeSize"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.4"

    [[inputs.snmp.table.field]]
    #poolStatus,1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.5,NAS-MIB,OBJECT-TYPE
      name = "poolStatus"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.5"

# QNAP VOLUME Table   
[[inputs.snmp.table]]
    #volumeTable,1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2,NAS-MIB,OBJECT-TYPE
    name = "volumeTable"
    inherit_tags = [ "hostname" ]
#
    [[inputs.snmp.table.field]]
    #volumeName,1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.8,NAS-MIB,OBJECT-TYPE
      name = "volumeName"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.8"
      is_tag = true
#
    [[inputs.snmp.table.field]]
    #volumeCapacity,1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.3,NAS-MIB,OBJECT-TYPE
      name = "volumeCapacity"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.3"
#
    [[inputs.snmp.table.field]]
    #volumeFreeSize,1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.4,NAS-MIB,OBJECT-TYPE
      name = "volumeFreeSize"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.4"
#
    [[inputs.snmp.table.field]]
    #volumeStatus,1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.5,NAS-MIB,OBJECT-TYPE
      name = "volumeStatus"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.5"

Logs from Telegraf

Feb 1 16:47:15 snmpdb telegraf[122]: 2022-02-01T15:47:15Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:47:15 snmpdb telegraf[122]: 2022-02-01T15:47:15Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:47:30 snmpdb telegraf[122]: 2022-02-01T15:47:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:47:30 snmpdb telegraf[122]: 2022-02-01T15:47:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:47:45 snmpdb telegraf[122]: 2022-02-01T15:47:45Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:47:45 snmpdb telegraf[122]: 2022-02-01T15:47:45Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:48:00 snmpdb telegraf[122]: 2022-02-01T15:48:00Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb 1 16:48:00 snmpdb telegraf[122]: 2022-02-01T15:48:00Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()

System info

Ubuntu 20.04LTS (LXD), Telegraf 1.21.3 affected, 1.20.4 is the last working release

Docker

No response

Steps to reproduce

  1. upgrade to latest telegraf release
  2. Workaround is downgrading to 1.20.4, works immediately.

Expected behavior

Collect snmp metrics

Actual behavior

telegraf stops working with the latest release, unable to collect data via snmp

Additional info

No response

MyaLongmire commented 2 years ago

I believe this is because the gosnmp library got updated. I will look into a workaround :)

MyaLongmire commented 2 years ago

@jostrasser can you please attach your mibs? I cannot reproduce this error with the local mibs I have on my machine.

jostrasser commented 2 years ago

Hi @MyaLongmire, I never installed MIBs manually and don't use MIBs... are they required now? After downgrading to 1.20.4 everything is working fine again. I only did an upgrade/downgrade; no other changes were made.

MyaLongmire commented 2 years ago

No, it is not required, I just thought it would help me reproduce the issue. I will continue to try and get that error to pop up.

jostrasser commented 2 years ago

> No, it is not required, I just thought it would help me reproduce the issue. I will continue to try and get that error to pop up.

Thanks for your feedback and support! If you need additional information or if I can do anything, let me know.

reimda commented 2 years ago

Hi @jostrasser, I also tried to reproduce the "performing get on field" errors and wasn't successful. There must be something that we haven't identified yet that is different in the way you run it and the way Mya and I do.

The telegraf.conf you provided isn't complete. It doesn't include the agent section or an output. I wonder if something in those sections is necessary to cause the failure.

I'll include below the complete telegraf.conf I've been using to try to reproduce this. I started with the config you provided, stripped out comments and added agent settings and a simple output to write to stdout.

I ran telegraf from a shell like this: ~/Downloads/telegraf-1.21.3/usr/bin/telegraf -config telegraf.conf.10554

With my config I tried changing a few settings that I thought might be involved in the errors:

Could you try my config and see if it still triggers the errors for you? If it doesn't, would you provide a complete config that does trigger the errors?

The logs you provided are also not complete. Before the "performing get on field" error I would expect to see an error that includes the words "setting up connection" which might give more information about why the connection is missing. Could you check your logs for such an error or include your full logs?
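To illustrate why a "setting up connection" error would be expected before the "Conn is missing" errors, here is a minimal Go sketch of one hypothetical way a poller could end up in this state. It assumes a session is cached per agent before the connect attempt succeeds. The `session` and `gatherer` types and all function names are made up for illustration; this is not Telegraf's or gosnmp's actual code, and the thread does not confirm that this is the mechanism.

```go
package main

import (
	"errors"
	"fmt"
)

// session is a hypothetical stand-in for an SNMP client; it is not gosnmp's type.
type session struct{ connected bool }

// connect simulates a dial that fails while the network is still down.
func (s *session) connect(networkUp bool) error {
	if !networkUp {
		return errors.New("network is unreachable")
	}
	s.connected = true
	return nil
}

// get mimics a client that refuses to run queries without a connection.
func (s *session) get(oid string) (string, error) {
	if !s.connected {
		return "", errors.New("Conn is missing")
	}
	return "value of " + oid, nil
}

// gatherer caches one session and reuses it on every collection cycle.
type gatherer struct{ cached *session }

// sessionFor stores the session in the cache BEFORE connecting (assumed bug),
// so a failed connect leaves a dead session behind that later calls reuse.
func (g *gatherer) sessionFor(networkUp bool) (*session, error) {
	if g.cached != nil {
		return g.cached, nil
	}
	s := &session{}
	g.cached = s
	if err := s.connect(networkUp); err != nil {
		return nil, fmt.Errorf("setting up connection: %w", err)
	}
	return s, nil
}

// gather runs one collection cycle and returns the value or the error.
func (g *gatherer) gather(networkUp bool) (string, error) {
	s, err := g.sessionFor(networkUp)
	if err != nil {
		return "", err
	}
	v, err := s.get("1.3.6.1.2.1.1.5.0")
	if err != nil {
		return "", fmt.Errorf("performing get: %w", err)
	}
	return v, nil
}

func main() {
	g := &gatherer{}
	// Cycle 1 runs while the network is down; cycles 2 and 3 run after it is up.
	for i, up := range []bool{false, true, true} {
		_, err := g.gather(up)
		fmt.Printf("cycle %d: %v\n", i+1, err)
	}
}
```

In this sketch only the first cycle reports the setup error. Every later cycle fails with the missing-connection error even though the network has come up, which is why the earliest log lines after startup are the most informative ones.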

Thanks!

[agent]
  interval = "5s"
  flush_interval = "5s"
  omit_hostname = true

[[inputs.snmp]]
  agents = [ "192.168.5.10:161" ]
  timeout = "10s"
  retries = 3
  version = 2
  community = "public"
  max_repetitions = 10

[[inputs.snmp.field]]
  name = "CPU_Load"
  oid = "1.3.6.1.4.1.24681.1.3.1.0"
[[inputs.snmp.field]]
  name = "CPU1"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196608"
[[inputs.snmp.field]]
  name = "CPU2"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196609"
[[inputs.snmp.field]]
  name = "CPU3"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196610"
[[inputs.snmp.field]]
  name = "CPU4"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196611"
[[inputs.snmp.field]]
  name = "CPU5"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196612"
[[inputs.snmp.field]]
  name = "CPU6"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196613"
[[inputs.snmp.field]]
  name = "CPU7"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196614"
[[inputs.snmp.field]]
  name = "CPU8"
  oid = "1.3.6.1.2.1.25.3.3.1.2.196615"
[[inputs.snmp.field]]
  name = "Total_RAM"
  oid = "1.3.6.1.4.1.24681.1.3.2.0"
[[inputs.snmp.field]]
  name = "Free_RAM"
  oid = "1.3.6.1.4.1.24681.1.3.3.0"
[[inputs.snmp.field]]
  name = "Fan1_Speed"
  oid = "1.3.6.1.4.1.24681.1.3.15.1.3.1"
[[inputs.snmp.field]]
  name = "Fan2_Speed"
  oid = "1.3.6.1.4.1.24681.1.3.15.1.3.2"
[[inputs.snmp.field]]
  name = "System_Temp"
  oid = "1.3.6.1.4.1.24681.1.3.6.0"

[[inputs.snmp.field]]
    name = "hostname"
    oid = "1.3.6.1.2.1.1.5.0"
    is_tag = true

[[inputs.snmp.table]]
    name = "ifTable"
    inherit_tags = [ "hostname" ]
    [[inputs.snmp.table.field]]
      name = "Interface"
      oid = "1.3.6.1.2.1.2.2.1.2"
      is_tag = true
    [[inputs.snmp.table.field]]
      name = "RXBytes"
      oid = "1.3.6.1.2.1.2.2.1.10"
    [[inputs.snmp.table.field]]
      name = "TXBytes"
      oid = "1.3.6.1.2.1.2.2.1.16"

[[inputs.snmp]]
  agents = [ "192.168.5.10:161"]
  timeout = "1s"
  interval = "10m"
  retries = 3
  version = 2
  community = "public"
  max_repetitions = 10

  [[inputs.snmp.table]]
    name = "HDDTable"
    inherit_tags = [ "hostname" ]
    [[inputs.snmp.table.field]]
      name = "HDDDescription"
      oid = "1.3.6.1.4.1.24681.1.3.11.1.2"
      is_tag = true
    [[inputs.snmp.table.field]]
      name = "Temperature"
      oid = "1.3.6.1.4.1.24681.1.3.11.1.3"
    [[inputs.snmp.table.field]]
      name = "Status"
      oid = "1.3.6.1.4.1.24681.1.3.11.1.4"
    [[inputs.snmp.table.field]]
      name = "S.M.A.R.T."
      oid = "1.3.6.1.4.1.24681.1.3.11.1.7"

[[inputs.snmp.table]]
    name = "poolTable"
    inherit_tags = [ "hostname" ]
    [[inputs.snmp.table.field]]
      name = "poolID"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.2"
      is_tag = true
    [[inputs.snmp.table.field]]
      name = "poolCapacity"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.3"
    [[inputs.snmp.table.field]]
      name = "poolFreeSize"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.4"

    [[inputs.snmp.table.field]]
      name = "poolStatus"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.2.2.1.5"

[[inputs.snmp.table]]
    name = "volumeTable"
    inherit_tags = [ "hostname" ]
    [[inputs.snmp.table.field]]
      name = "volumeName"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.8"
      is_tag = true
    [[inputs.snmp.table.field]]
      name = "volumeCapacity"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.3"
    [[inputs.snmp.table.field]]
      name = "volumeFreeSize"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.4"
    [[inputs.snmp.table.field]]
      name = "volumeStatus"
      oid = "1.3.6.1.4.1.24681.1.4.1.1.1.2.3.2.1.5"

[[outputs.file]]
files = ["stdout"]
data_format = "influx"

jostrasser commented 2 years ago

Hi @reimda Thanks for your feedback, hints and your help!

Sorry about the incomplete config and logs. I've now replaced the config I added with the full one.

I am writing the data into influxdb (1.8.10)... I can try your config to see if there is any difference. I will also check the logs afterwards to give you a better view of my problem.

Thanks!

jostrasser commented 2 years ago

Hi @reimda I have now done some tests and found the root cause for my issue. It looks like a timing issue since upgrading to a newer version of telegraf (1.20.4 is the last working version).

I've done the following, using my unchanged telegraf.conf:

1) upgraded from 1.20.4 to 1.21.3
2) restarted telegraf service via systemctl
3) telegraf is collecting as before
4) restarted system
5) telegraf stops collecting

Here are some logs:

After service restart:

Feb  5 22:49:07 snmpdb systemd[1]: Failed to attach 237 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Feb  5 22:49:07 snmpdb systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
Feb  5 22:49:07 snmpdb systemd[237]: Failed to attach 237 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! Starting Telegraf 1.21.3
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! Loaded inputs: snmp (2x)
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! Loaded aggregators:
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! Loaded processors:
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! Loaded outputs: influxdb
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! Tags enabled: host=snmpdb
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"snmpdb", Flush Interval:10s
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z W! [inputs.snmp] No mibs found
Feb  5 22:49:07 snmpdb telegraf[237]: 2022-02-05T21:49:07Z W! [inputs.snmp] MIB path doesn't exist: "/usr/share/snmp/mibs"

After reboot:

Feb  5 22:47:11 snmpdb systemd[1]: Started Session 16643 of user root.
Feb  5 22:47:15 snmpdb telegraf[120]: 2022-02-05T21:47:15Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:47:15 snmpdb telegraf[120]: 2022-02-05T21:47:15Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:47:17 snmpdb systemd[1]: Started Session 16645 of user root.
Feb  5 22:47:30 snmpdb telegraf[120]: 2022-02-05T21:47:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:47:30 snmpdb telegraf[120]: 2022-02-05T21:47:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:47:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 22:47:31 snmpdb systemd[1]: systemd-hostnamed.service: Succeeded.
Feb  5 22:47:45 snmpdb telegraf[120]: 2022-02-05T21:47:45Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:47:45 snmpdb telegraf[120]: 2022-02-05T21:47:45Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:48:00 snmpdb telegraf[120]: 2022-02-05T21:48:00Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Feb  5 22:48:00 snmpdb telegraf[120]: 2022-02-05T21:48:00Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()

I have added the following to the telegraf.service to delay the start of telegraf:

[Service]
ExecStartPre=/bin/sleep 60

After that telegraf resumes collection if I restart my LXD container.

I have no idea why this only occurs on newer versions of telegraf after version 1.20.4.

Here is the log after implementing the workaround and restarting the container. I hope this helps:

Feb  5 23:05:30 snmpdb systemd[1]: Starting Flush Journal to Persistent Storage...
Feb  5 23:05:30 snmpdb systemd[1]: Finished Flush Journal to Persistent Storage.
Feb  5 23:05:30 snmpdb systemd[1]: Starting Create Volatile Files and Directories...
Feb  5 23:05:30 snmpdb systemd-networkd[71]: eth0: IPv6 successfully enabled
Feb  5 23:05:30 snmpdb systemd-networkd[71]: eth0: Cannot disable kernel IPv6 accept_ra for interface: Read-only file system
Feb  5 23:05:30 snmpdb systemd-networkd[71]: eth0: cannot set sysctl net/ipv4/conf/eth0/promote_secondaries to 1
Feb  5 23:05:30 snmpdb systemd[1]: Finished Create Volatile Files and Directories.
Feb  5 23:05:30 snmpdb systemd[1]: Starting Update UTMP about System Boot/Shutdown...
Feb  5 23:05:30 snmpdb systemd[80]: Failed to attach 80 to compat systemd cgroup /system.slice/influxdb.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[1]: Started System Logging Service.
Feb  5 23:05:30 snmpdb rsyslogd: imuxsock: Acquired UNIX socket '/run/systemd/journal/syslog' (fd 3) from systemd.  [v8.2001.0]
Feb  5 23:05:30 snmpdb systemd[84]: Failed to attach 84 to compat systemd cgroup /system.slice/systemd-resolved.service: No such file or directory
Feb  5 23:05:30 snmpdb rsyslogd: rsyslogd's groupid changed to 110
Feb  5 23:05:30 snmpdb rsyslogd: rsyslogd's userid changed to 104
Feb  5 23:05:30 snmpdb rsyslogd: [origin software="rsyslogd" swVersion="8.2001.0" x-pid="82" x-info="https://www.rsyslog.com"] start
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd/udp: socket 7: sendto() error: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: omfwd: socket 7: error 101 sending via udp: Network is unreachable [v8.2001.0 try https://www.rsyslog.com/e/2354 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb rsyslogd: action 'action-6-builtin:omfwd' suspended (module 'builtin:omfwd'), next retry is Sat Feb  5 23:06:00 2022, retry nbr 0. There should be messages before this one giving the reason for suspension. [v8.2001.0 try https://www.rsyslog.com/e/2007 ]
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[90]: Merging with configuration at: /etc/influxdb/influxdb.conf
Feb  5 23:05:30 snmpdb systemd[1]: Started Login Service.
Feb  5 23:05:30 snmpdb systemd-resolved[84]: Positive Trust Anchors:
Feb  5 23:05:30 snmpdb systemd-resolved[84]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Feb  5 23:05:30 snmpdb systemd-resolved[84]: Positive Trust Anchors:
Feb  5 23:05:30 snmpdb systemd-resolved[84]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Feb  5 23:05:30 snmpdb systemd-resolved[84]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.318369Z lvl=info msg="InfluxDB starting" log_id=0ZUz_TJW000 version=1.8.10 branch=1.8 commit=688e697c51fd
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.318402Z lvl=info msg="Go runtime" log_id=0ZUz_TJW000 version=go1.13.8 maxprocs=8
Feb  5 23:05:30 snmpdb systemd-resolved[84]: Using system hostname 'snmpdb'.
Feb  5 23:05:30 snmpdb systemd[1]: Started Network Name Resolution.
Feb  5 23:05:30 snmpdb systemd[1]: Reached target Network.
Feb  5 23:05:30 snmpdb systemd[1]: Reached target Host and Network Name Lookups.
Feb  5 23:05:30 snmpdb systemd[1]: Failed to attach 120 to compat systemd cgroup /system.slice/ssh.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[1]: Starting OpenBSD Secure Shell server...
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[111]: Merging with configuration at: /etc/influxdb/influxdb.conf
Feb  5 23:05:30 snmpdb systemd[120]: Failed to attach 120 to compat systemd cgroup /system.slice/ssh.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[1]: Starting Permit User Sessions...
Feb  5 23:05:30 snmpdb systemd[1]: Failed to attach 122 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[1]: Starting The plugin-driven server agent for reporting metrics into InfluxDB...
Feb  5 23:05:30 snmpdb systemd[122]: Failed to attach 122 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[1]: Finished Permit User Sessions.
Feb  5 23:05:30 snmpdb systemd[1]: Failed to attach 124 to compat systemd cgroup /system.slice/console-getty.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[1]: Started Console Getty.
Feb  5 23:05:30 snmpdb systemd[1]: Condition check resulted in Set console scheme being skipped.
Feb  5 23:05:30 snmpdb systemd[1]: Created slice system-getty.slice.
Feb  5 23:05:30 snmpdb systemd[1]: Condition check resulted in Getty on tty1 being skipped.
Feb  5 23:05:30 snmpdb systemd[1]: Reached target Login Prompts.
Feb  5 23:05:30 snmpdb systemd[1]: Failed to attach 126 to compat systemd cgroup /system.slice/ssh.service: No such file or directory
Feb  5 23:05:30 snmpdb systemd[126]: Failed to attach 126 to compat systemd cgroup /system.slice/ssh.service: No such file or directory
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[80]: InfluxDB API unavailable after 1 attempts...
Feb  5 23:05:30 snmpdb systemd[1]: Started OpenBSD Secure Shell server.
Feb  5 23:05:30 snmpdb networkd-dispatcher[81]: No valid path found for iwconfig
Feb  5 23:05:30 snmpdb networkd-dispatcher[81]: No valid path found for iw
Feb  5 23:05:30 snmpdb systemd[1]: Started Dispatcher daemon for systemd-networkd.
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.418982Z lvl=info msg="Using data dir" log_id=0ZUz_TJW000 service=store path=/snmpdb/data
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.419006Z lvl=info msg="Compaction settings" log_id=0ZUz_TJW000 service=store max_concurrent_compactions=4 throughput_bytes_per_second=50331648 throughput_bytes_per_second_burst=50331648
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.419020Z lvl=info msg="Open store (start)" log_id=0ZUz_TJW000 service=store trace_id=0ZUz_Thl000 op_name=tsdb_open op_event=start
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.437058Z lvl=info msg="Opened file" log_id=0ZUz_TJW000 engine=tsm1 service=filestore path=/snmpdb/data/_internal/monitor/1927/000000016-000000002.tsm id=1 duration=1.168ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.437429Z lvl=info msg="Opened file" log_id=0ZUz_TJW000 engine=tsm1 service=filestore path=/snmpdb/data/_internal/monitor/1921/000000020-000000002.tsm id=0 duration=1.623ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.437478Z lvl=info msg="Opened file" log_id=0ZUz_TJW000 engine=tsm1 service=filestore path=/snmpdb/data/_internal/monitor/1924/000000019-000000002.tsm id=0 duration=1.582ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.437510Z lvl=info msg="Opened file" log_id=0ZUz_TJW000 engine=tsm1 service=filestore path=/snmpdb/data/_internal/monitor/1925/000000019-000000002.tsm id=0 duration=1.675ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.437591Z lvl=info msg="Opened file" log_id=0ZUz_TJW000 engine=tsm1 service=filestore path=/snmpdb/data/_internal/monitor/1920/000000019-000000002.tsm id=0 duration=1.713ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.438395Z lvl=info msg="Opened file" log_id=0ZUz_TJW000 engine=tsm1 service=filestore path=/snmpdb/data/_internal/monitor/1926/000000019-000000002.
.
.
.
.
.
.
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.575006Z lvl=info msg="Opened shard" log_id=0ZUz_TJW000 service=store trace_id=0ZUz_Thl000 op_name=tsdb_open index_version=inmem path=/snmpdb/data/snmpdb/retention1y/1740 duration=11.844ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.576619Z lvl=info msg="Opened shard" log_id=0ZUz_TJW000 service=store trace_id=0ZUz_Thl000 op_name=tsdb_open index_version=inmem path=/snmpdb/data/snmpdb/retention1y/1748 duration=10.108ms
Feb  5 23:05:30 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:30.952190Z lvl=info msg="Reading file" log_id=0ZUz_TJW000 engine=tsm1 service=cacheloader path=/snmpdb/wal/_internal/monitor/1927/_00066.wal size=10522536
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[80]: InfluxDB API unavailable after 2 attempts...
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.445034Z lvl=info msg="Reading file" log_id=0ZUz_TJW000 engine=tsm1 service=cacheloader path=/snmpdb/wal/_internal/monitor/1927/_00067.wal size=3632926
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.644755Z lvl=info msg="Opened shard" log_id=0ZUz_TJW000 service=store trace_id=0ZUz_Thl000 op_name=tsdb_open index_version=inmem path=/snmpdb/data/_internal/monitor/1927 duration=1209.316ms
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646777Z lvl=info msg="Open store (end)" log_id=0ZUz_TJW000 service=store trace_id=0ZUz_Thl000 op_name=tsdb_open op_event=end op_elapsed=1227.758ms
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646832Z lvl=info msg="Opened service" log_id=0ZUz_TJW000 service=subscriber
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646840Z lvl=info msg="Starting monitor service" log_id=0ZUz_TJW000 service=monitor
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646846Z lvl=info msg="Registered diagnostics client" log_id=0ZUz_TJW000 service=monitor name=build
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646850Z lvl=info msg="Registered diagnostics client" log_id=0ZUz_TJW000 service=monitor name=runtime
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646854Z lvl=info msg="Registered diagnostics client" log_id=0ZUz_TJW000 service=monitor name=network
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646861Z lvl=info msg="Registered diagnostics client" log_id=0ZUz_TJW000 service=monitor name=system
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646876Z lvl=info msg="Starting precreation service" log_id=0ZUz_TJW000 service=shard-precreation check_interval=10m advance_period=30m
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646889Z lvl=info msg="Starting snapshot service" log_id=0ZUz_TJW000 service=snapshot
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646900Z lvl=info msg="Starting continuous query service" log_id=0ZUz_TJW000 service=continuous_querier
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646906Z lvl=info msg="Starting HTTP service" log_id=0ZUz_TJW000 service=httpd authentication=false
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.646981Z lvl=info msg="Listening on HTTP" log_id=0ZUz_TJW000 service=httpd addr=[::]:8086 https=false
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.647001Z lvl=info msg="Starting retention policy enforcement service" log_id=0ZUz_TJW000 service=retention check_interval=30m
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.647005Z lvl=info msg="Storing statistics" log_id=0ZUz_TJW000 service=monitor db_instance=_internal db_rp=monitor interval=10s
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.647044Z lvl=info msg="Listening for signals" log_id=0ZUz_TJW000
Feb  5 23:05:31 snmpdb influxd-systemd-start.sh[88]: ts=2022-02-05T22:05:31.648149Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=0ZUz_TJW000
Feb  5 23:05:32 snmpdb influxd-systemd-start.sh[80]: InfluxDB started
Feb  5 23:05:32 snmpdb systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Feb  5 23:05:32 snmpdb systemd-networkd[71]: eth0: Gained IPv6LL
Feb  5 23:05:35 snmpdb systemd[124]: Failed to attach 124 to compat systemd cgroup /system.slice/console-getty.service: No such file or directory
Feb  5 23:05:35 snmpdb systemd[1]: dmesg.service: Succeeded.
Feb  5 23:06:01 snmpdb systemd-networkd[71]: eth0: DHCPv4 address 192.168.5.112/24 via 192.168.5.1
Feb  5 23:06:01 snmpdb dbus-daemon[78]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.0' (uid=100 pid=71 comm="/lib/systemd/systemd-networkd ")
Feb  5 23:06:01 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Feb  5 23:06:01 snmpdb systemd[1]: Starting Hostname Service...
Feb  5 23:06:01 snmpdb dbus-daemon[78]: [system] Successfully activated service 'org.freedesktop.hostname1'
Feb  5 23:06:01 snmpdb systemd[1]: Started Hostname Service.
Feb  5 23:06:11 snmpdb systemd[1]: Created slice User Slice of UID 0.
Feb  5 23:06:11 snmpdb systemd[1]: Starting User Runtime Directory /run/user/0...
Feb  5 23:06:11 snmpdb systemd[1]: Finished User Runtime Directory /run/user/0.
Feb  5 23:06:11 snmpdb systemd[1]: Starting User Manager for UID 0...
Feb  5 23:06:11 snmpdb systemd[155]: Reached target Paths.
Feb  5 23:06:11 snmpdb systemd[155]: Reached target Timers.
Feb  5 23:06:11 snmpdb systemd[155]: Listening on GnuPG network certificate management daemon.
Feb  5 23:06:11 snmpdb systemd[155]: Listening on GnuPG cryptographic agent and passphrase cache (access for web browsers).
Feb  5 23:06:11 snmpdb systemd[155]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
Feb  5 23:06:11 snmpdb systemd[155]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
Feb  5 23:06:11 snmpdb systemd[155]: Listening on GnuPG cryptographic agent and passphrase cache.
Feb  5 23:06:11 snmpdb systemd[155]: Reached target Sockets.
Feb  5 23:06:11 snmpdb systemd[155]: Reached target Basic System.
Feb  5 23:06:11 snmpdb systemd[155]: Reached target Main User Target.
Feb  5 23:06:11 snmpdb systemd[155]: Startup finished in 17ms.
Feb  5 23:06:11 snmpdb systemd[1]: Started User Manager for UID 0.
Feb  5 23:06:11 snmpdb systemd[1]: Started Session 16654 of user root.
Feb  5 23:06:30 snmpdb systemd[1]: Failed to attach 190 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Feb  5 23:06:30 snmpdb systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
Feb  5 23:06:30 snmpdb systemd[1]: Reached target Multi-User System.
Feb  5 23:06:30 snmpdb systemd[1]: Reached target Graphical Interface.
Feb  5 23:06:30 snmpdb systemd[1]: Starting Update UTMP about System Runlevel Changes...
Feb  5 23:06:30 snmpdb systemd[190]: Failed to attach 190 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Feb  5 23:06:30 snmpdb systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Feb  5 23:06:30 snmpdb systemd[1]: Finished Update UTMP about System Runlevel Changes.
Feb  5 23:06:30 snmpdb systemd[1]: Startup finished in 1min 219ms.
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! Starting Telegraf 1.21.3
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! Loaded inputs: snmp (2x)
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! Loaded aggregators:
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! Loaded processors:
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! Loaded outputs: influxdb
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! Tags enabled: host=snmpdb
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"snmpdb", Flush Interval:10s
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z W! [inputs.snmp] No mibs found
Feb  5 23:06:30 snmpdb telegraf[190]: 2022-02-05T22:06:30Z W! [inputs.snmp] MIB path doesn't exist: "/usr/share/snmp/mibs"
Feb  5 23:06:31 snmpdb systemd[1]: systemd-hostnamed.service: Succeeded.

Thanks and BR/JO!

jostrasser commented 2 years ago

Tested Telegraf 1.21.4 (git: HEAD 583ee20a) but can see the same behavior.

Hipska commented 2 years ago

It looks like snmp first connects to the device, and then gets disconnected or something? Would you also be able to share a packet capture with tcpdump?

Please also give us the (relevant) logs when running telegraf 1.21.4 in debug mode. (So please only telegraf logs, not other system logs)

jostrasser commented 2 years ago

> It looks like snmp first connects to the device, and then gets disconnected or something? Would you also be able to share a packet capture with tcpdump?
>
> Please also give us the (relevant) logs when running telegraf 1.21.4 in debug mode. (So please only telegraf logs, not other system logs)

Hi, I think this is not easy because I have to restart the system to reproduce the issue... Any ideas on how I should go about this? Thanks!

Hipska commented 2 years ago

Wait, that's not clear. So it is working now? But not directly after a restart?

jostrasser commented 2 years ago

> Wait, that's not clear. So it is working now? But not directly after a restart?

Correct, see: https://github.com/influxdata/telegraf/issues/10554#issuecomment-1030709029

Hipska commented 2 years ago

I have read that post multiple times, but I don't know how to interpret it.

Anyway, now it looks more like we are dealing with race conditions that occur during system boot. (Maybe network not completely ready somehow)

Just to be clear, when having this issue, the issue remains until telegraf service is restarted? It doesn't recover automatically after a while?

jostrasser commented 2 years ago

> Just to be clear, when having this issue, the issue remains until telegraf service is restarted? It doesn't recover automatically after a while?

Yes, that's correct. I have to manually restart the telegraf service afterward (or delay its start). It is possible that the network takes some time to come up, but it is strange that this only occurs on versions after 1.20.4.

Hipska commented 2 years ago

@MyaLongmire could we catch this error and try reconnecting when it happens? Or what other method would work around this specific problem?

bondskin commented 2 years ago

Thanks @sjwang90 for taking care of this. The last good version for me (running on a Raspberry Pi 4, 5.10.103-v8+) was telegraf_1.20.4-1_armhf.deb.

With 1.22.0 I now get this:

Mar 25 10:46:00 raspberrypidns telegraf[16226]: 2022-03-25T09:46:00Z E! [inputs.snmp] Error in plugin: agent dreamer.ip: gathering table hrProcessorTable: performing bulk walk for field hrProcessorFrwID: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Mar 25 10:46:00 raspberrypidns telegraf[16226]: 2022-03-25T09:46:00Z E! [inputs.snmp] Error in plugin: agent dreamer.ip: gathering table laTable: performing bulk walk for field laNames: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Mar 25 10:46:00 raspberrypidns telegraf[16226]: 2022-03-25T09:46:00Z E! [inputs.snmp] Error in plugin: agent area51.ip: performing get on field sysName: &GoSNMP.Conn is missing. Provide a connection or use Connect()

jostrasser commented 2 years ago

Hi all, Telegraf 1.22.0 (https://github.com/influxdata/telegraf/releases/tag/v1.22.0) is having the same issue.

I have applied the following workaround, and it is working fine:

edited /lib/systemd/system/telegraf.service

added

[Service]
ExecStartPre=/bin/sleep 60

It is 100% related to the network service startup in my LXD containers, but there must be a difference between 1.20.4 and all later releases.
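
A possibly more robust variant of the same workaround, as a minimal sketch: instead of a fixed sleep, the unit can be ordered after network-online.target through a drop-in file. This assumes the packaged unit does not already do so, and the drop-in path shown (the one that systemctl edit telegraf would create) is an assumption as well. Note that network-online.target only delays anything if a wait-online service (for example systemd-networkd-wait-online.service) is enabled on the system.

```ini
# /etc/systemd/system/telegraf.service.d/override.conf  (assumed drop-in path)
[Unit]
# Start Telegraf only once the network is reported as online,
# not merely once the network stack has been started.
Wants=network-online.target
After=network-online.target
```

A drop-in like this should survive package upgrades, whereas changes made directly to /lib/systemd/system/telegraf.service may be overwritten. After creating the file, systemctl daemon-reload is needed for it to take effect.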

bondskin commented 2 years ago

Hi @jostrasser, nice to see you here. For me, the issue is different. I don't use Docker or containers, so the startup issue does not occur for me. But, same as for you, the last good version was 1.20.4.

However, running Telegraf is producing these errors:

Mar 27 13:07:00 raspberrypidns telegraf[10020]: 2022-03-27T11:07:00Z E! [inputs.snmp] Error in plugin: agent millenniumfalcon.ip: performing get on field sysName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Mar 27 13:07:00 raspberrypidns telegraf[10020]: 2022-03-27T11:07:00Z E! [inputs.snmp] Error in plugin: agent millenniumfalcon.ip: gathering table ifTable: performing bulk walk for field ifDescr: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Mar 27 13:07:00 raspberrypidns telegraf[10020]: 2022-03-27T11:07:00Z E! [inputs.snmp] Error in plugin: agent millenniumfalcon.ip: gathering table unifiRadioTable: performing bulk walk for field unifiRadioName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Mar 27 13:07:00 raspberrypidns telegraf[10020]: 2022-03-27T11:07:00Z E! [inputs.snmp] Error in plugin: agent millenniumfalcon.ip: gathering table unifiVapTable: performing bulk walk for field unifiVapName: &GoSNMP.Conn is missing. Provide a connection or use Connect()

jostrasser commented 2 years ago

Hi @bondskin

Can you confirm that your issue also occurs when you stop and start the service manually and not only immediately after a system restart (in short: permanently)?

Thanks!

bondskin commented 2 years ago

yes, @jostrasser , confirmed. Stopping and starting again does not fix the "&GoSNMP.Conn is missing. Provide a connection or use Connect()" error. At least, Telegraf keeps running, which was not the case with 1.21.x.

svestenik commented 2 years ago

> yes, @jostrasser , confirmed. Stopping and starting again does not fix the "&GoSNMP.Conn is missing. Provide a connection or use Connect()" error. At least, Telegraf keeps running, which was not the case with 1.21.x.

Same here. Telegraf version 1.22, running on CentOS 8 Stream. A lot of different SNMP input plugins stopped working after the update to 1.22, with the same error:

2022-04-06T08:10:03Z E! [inputs.snmp::Inventory] Error in plugin: agent 192.168.25.52:161: gathering table VRF_interface: performing bulk walk for field vrfIntType: &GoSNMP.Conn is missing. Provide a connection or use Connect()

Restarting Telegraf does not help.

bondskin commented 2 years ago

good morning, no change with Telegraf 1.22.1 (git: HEAD fc8301ae)

svestenik commented 2 years ago

Same here. Stuck between a rock and a hard place... I am still getting the &GoSNMP.Conn is missing error with 1.22.1 on some OIDs, but if I roll back, I am not able to resolve some other OIDs in 1.21.x that work properly in 1.22.x.

MyaLongmire commented 2 years ago

Sorry for the delay in getting back to you :). This comment on the GoSNMP repo makes me think we just need to implement a setting which has been done in this pr if you wouldn't mind giving it a test.

svestenik commented 2 years ago

> Sorry for the delay in getting back to you :). This comment on the GoSNMP repo makes me think we just need to implement a setting which has been done in this pr if you wouldn't mind giving it a test.

After reading the original thread I think it actually makes a lot of sense in my case. Anxiously waiting for update on plugin :).

Hipska commented 2 years ago

@svestenik can you test with the builds from the PR to see if it actually fixes the problem?

jostrasser commented 2 years ago

Hi all, no changes with Telegraf 1.22.2 (git: HEAD a60db9ba)

Hipska commented 2 years ago

Correct, the PR hasn’t been merged yet. Could you please test with the built artifacts from there?

jostrasser commented 2 years ago

Hi @Hipska, sure, but I need some assistance on how to install the build and then revert to the official 1.22.2 on Ubuntu, please. The current 1.22.2 is installed via apt.

I tried to pull the build for Ubuntu, but I got:

root@snmpdb:/tmp# wget https://156688-33258973-gh.circle-artifacts.com/0/build/dist/telegraf_1.22.0%7Ea1db1d8d-0_amd64.deb
--2022-04-26 07:10:18--  https://156688-33258973-gh.circle-artifacts.com/0/build/dist/telegraf_1.22.0%7Ea1db1d8d-0_amd64.deb
Resolving 156688-33258973-gh.circle-artifacts.com (156688-33258973-gh.circle-artifacts.com)... 35.171.124.127, 52.200.33.173, 54.209.158.153
Connecting to 156688-33258973-gh.circle-artifacts.com (156688-33258973-gh.circle-artifacts.com)|35.171.124.127|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-04-26 07:10:19 ERROR 404: Not Found.

Thanks!

Hipska commented 2 years ago

Yeah indeed, the artifacts have expired in the meantime. I initiated a new build just now; the bot will inform you of the new download links.

jostrasser commented 2 years ago

> Yeah indeed, the artifacts are expired in the meantime. I initiated a new build just now, the bot will inform you of the new download links.

Thanks. Please let me know how to install the build and how to revert to the official release via apt. Thanks!

svestenik commented 2 years ago

Installed the branch build from the rpm. It didn't go too well... My system load after starting Telegraf went to 56.07 :) And nothing gets written into InfluxDB. CentOS 8 Stream, InfluxDB version 1.8.10.

When I check the Telegraf log file, I see it doing nothing:

tail /var/log/telegraf/telegraf.log

2022-04-26T09:48:27Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"", Flush Interval:10s
2022-04-26T09:48:27Z D! [agent] Initializing plugins
2022-04-26T09:51:41Z I! Starting Telegraf 1.22.0-a1db1d8d
2022-04-26T09:51:41Z I! Loaded inputs: cpu disk diskio exec (2x) kernel mem ping (2x) processes snmp (27x) sqlserver swap system x509_cert
2022-04-26T09:51:41Z I! Loaded aggregators:
2022-04-26T09:51:41Z I! Loaded processors: converter (15x) regex (11x) rename (7x) starlark strings (12x)
2022-04-26T09:51:41Z I! Loaded outputs: influxdb
2022-04-26T09:51:41Z I! Tags enabled:
2022-04-26T09:51:41Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"", Flush Interval:10s
2022-04-26T09:51:41Z D! [agent] Initializing plugins

It then just hangs there instead of progressing to input processing.

Hipska commented 2 years ago

I would report that in the respective PR..

bondskin commented 2 years ago

Good morning, I have now installed 1.22.3 (git: HEAD ff950615) on my Pi and it has improved. In my scenario, I am still getting that error message, but only for one specific client (Unifi AP Mesh6). Strangely, all other Unifi APs are behaving correctly and not producing this error (the firmware is the same on all of them, 6.0.18). I'll report this to Unifi; maybe there is something different in their SNMP implementation.

Apr 30 09:41:01 raspberrypidns telegraf[13301]: 2022-04-30T07:41:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table hrProcessorTable: performing bulk walk for field hrProcessorFrwID: &GoSNMP.Conn is missing. Provide a connection or use Connect()

@jostrasser, do you have a similar experience?

jostrasser commented 2 years ago

Hi @bondskin ,

nope, but the messages are different now. Without the ExecStartPre=/bin/sleep 60 the service isn't coming up:

Apr 30 12:06:28 snmpdb systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Apr 30 12:06:28 snmpdb systemd[1]: Reached target Multi-User System.
Apr 30 12:06:28 snmpdb systemd[1]: Reached target Graphical Interface.
Apr 30 12:06:28 snmpdb systemd[1]: Starting Update UTMP about System Runlevel Changes...
Apr 30 12:06:28 snmpdb systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Apr 30 12:06:28 snmpdb systemd[1]: Finished Update UTMP about System Runlevel Changes.
Apr 30 12:06:28 snmpdb systemd[1]: Startup finished in 2.295s.
Apr 30 12:06:28 snmpdb systemd[120]: Failed to attach 120 to compat systemd cgroup /system.slice/console-getty.service: No such file or directory
Apr 30 12:06:28 snmpdb systemd[1]: dmesg.service: Succeeded.
Apr 30 12:06:30 snmpdb telegraf[119]: 2022-04-30T10:06:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: setting up connection: error establishing connection to host: dial udp :0->192.168.5.10:161: connect: network is unreachable
Apr 30 12:06:41 snmpdb kernel: [703991.024044] qvs0: port 8(veth2afb66c6) entered learning state
Apr 30 12:06:45 snmpdb telegraf[119]: 2022-04-30T10:06:45Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:06:45 snmpdb telegraf[119]: 2022-04-30T10:06:45Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:06:58 snmpdb systemd-networkd[70]: eth0: DHCPv4 address 192.168.5.112/24 via 192.168.5.1
Apr 30 12:06:58 snmpdb dbus-daemon[78]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.0' (uid=100 pid=70 comm="/lib/systemd/systemd-networkd ")
Apr 30 12:06:58 snmpdb rsyslogd: action 'action-6-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2001.0 try https://www.rsyslog.com/e/2359 ]
Apr 30 12:06:58 snmpdb systemd[1]: Starting Hostname Service...
Apr 30 12:06:58 snmpdb dbus-daemon[78]: [system] Successfully activated service 'org.freedesktop.hostname1'
Apr 30 12:06:58 snmpdb systemd[1]: Started Hostname Service.
Apr 30 12:07:00 snmpdb telegraf[119]: 2022-04-30T10:07:00Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:07:00 snmpdb telegraf[119]: 2022-04-30T10:07:00Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:07:15 snmpdb telegraf[119]: 2022-04-30T10:07:15Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:07:15 snmpdb telegraf[119]: 2022-04-30T10:07:15Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:07:28 snmpdb systemd[1]: systemd-hostnamed.service: Succeeded.
Apr 30 12:07:30 snmpdb telegraf[119]: 2022-04-30T10:07:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: performing get on field CPU_Load: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:07:30 snmpdb telegraf[119]: 2022-04-30T10:07:30Z E! [inputs.snmp] Error in plugin: agent 192.168.5.10:161: gathering table ifTable: performing bulk walk for field Interface: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 12:07:30 snmpdb systemd[1]: Created slice User Slice of UID 0.
Apr 30 12:07:30 snmpdb systemd[1]: Starting User Runtime Directory /run/user/0...
Apr 30 12:07:30 snmpdb systemd[1]: Finished User Runtime Directory /run/user/0.
Apr 30 12:07:30 snmpdb systemd[1]: Starting User Manager for UID 0...
Apr 30 12:07:30 snmpdb systemd[165]: Reached target Paths.
Apr 30 12:07:30 snmpdb systemd[165]: Reached target Timers.

Hipska commented 2 years ago

@jostrasser do you also see a "Reached target Network" in these logs? I ask this because seeing "eth0: DHCPv4 address" between logs of telegraf makes me think telegraf is already started before the network is completely up and running.

@MyaLongmire @reimda Do you have any ideas on this ticket, which has been open since February?

jostrasser commented 2 years ago

> @jostrasser do you also see a "Reached target Network" in these logs? I ask this because seeing "eth0: DHCPv4 address" between logs of telegraf makes me think telegraf is already started before the network is completely up and running.
>
> @MyaLongmire @reimda Do you have any idea on this ticket that is open since February?

Yes, I think you are right. Telegraf starts before the network is online... but I have no idea why this occurs on all releases after 1.20.4.

bondskin commented 2 years ago

I just downgraded to 1.20.4, which is also the last good version for me. This also made my issue with one SNMP client go away.

Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: performing get on field sysName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table ifTable: performing bulk walk for field ifDescr: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table unifiRadioTable: performing bulk walk for field unifiRadioName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table unifiVapTable: performing bulk walk for field unifiVapName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table unifiIfTable: performing bulk walk for field unifiIfName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table hrProcessorTable: performing bulk walk for field hrProcessorFrwID: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:40:01 raspberrypidns telegraf[13301]: 2022-04-30T07:40:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: gathering table laTable: performing bulk walk for field laNames: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:41:01 raspberrypidns telegraf[13301]: 2022-04-30T07:41:01Z E! [inputs.snmp] Error in plugin: agent area51.ip: performing get on field sysName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
Apr 30 09:41:01 raspberrypidns telegraf[13301]: 2022-04-30T07:41:01Z E! [inputs.snmp] Error in p

Hipska commented 2 years ago

@jostrasser I think #11042 might help with your issue as well. You could try the latest nightly build.

jostrasser commented 2 years ago

Hi @Hipska, I can confirm: version 1.22.4 fixes the issue! https://github.com/influxdata/telegraf/releases/tag/v1.22.4

Thanks for your help!! 👍

BR/JO!

bondskin commented 2 years ago

The GoSNMP issue still persists for me:

May 19 08:48:00 raspberrypidns telegraf[4503]: 2022-05-19T06:48:00Z E! [inputs.snmp] Error in plugin: agent flash.ip: gathering table ifTable: performing bulk walk for field ifDescr: &GoSNMP.Conn is missing. Provide a connection or use Connect()
May 19 08:48:00 raspberrypidns telegraf[4503]: 2022-05-19T06:48:00Z E! [inputs.snmp] Error in plugin: agent flash.ip: gathering table unifiRadioTable: performing bulk walk for field unifiRadioName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
May 19 08:48:00 raspberrypidns telegraf[4503]: 2022-05-19T06:48:00Z E! [inputs.snmp] Error in plugin: agent flash.ip: gathering table unifiVapTable: performing bulk walk for field unifiVapName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
May 19 08:48:00 raspberrypidns telegraf[4503]: 2022-05-19T06:48:00Z E! [inputs.snmp] Error in plugin: agent flash.ip: gathering table unifiIfTable: performing bulk walk for field unifiIfName: &GoSNMP.Conn is missing. Provide a connection or use Connect()
May 19 08:48:00 raspberrypidns telegraf[4503]: 2022-05-19T06:48:00Z E! [inputs.snmp] Error in plugin: agent flash.ip: gathering table hrProcessorTable: performing bulk walk for field hrProcessorFrwID: &GoSNMP.Conn is missing. Provide a connection or use Connect()
May 19 08:48:00 raspberrypidns telegraf[4503]: 2022-05-19T06:48:00Z E! [inputs.snmp] Error in plugin: agent flash.ip: gathering table laTable: performing bulk walk for field laNames: &GoSNMP.Conn is missing. Provide a connection or use Connect()

Telegraf 1.22.4 (git: HEAD acf67065)

Hipska commented 2 years ago

I'm going to reopen your original issue #10890, as this issue has a different root cause and is actually fixed.