home-assistant / core

Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Devices stop loading with too many SNMP sensors #31179

jriker1 closed this issue 4 years ago

jriker1 commented 4 years ago

The problem

When I add too many SNMP sensors, various things fail to load or even report that they can't connect to various internal IP addresses. If I take most or all of the SNMP entries out, the problems go away.

Environment

Home Assistant 0.104.3 on Ubuntu, running in a VMware Workstation virtual machine.

Problem-relevant configuration.yaml

If both of these YAML files are included, I get errors. If I take one out, I get fewer errors. If I take both out, all the errors and warnings go away.

sensor_dell2130cn.yaml :
- platform: snmp
  name: Dell 2130CN Cyan
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.1
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'
  value_template: "{{ ((value | float / 2500)*100) | int }}"

- platform: snmp
  name: Dell 2130CN Magenta
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.2
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'
  value_template: "{{ ((value | float / 2500)*100) | int }}"

- platform: snmp
  name: Dell 2130CN Yellow
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.3
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'
  value_template: "{{ ((value | float / 2500)*100) | int }}"

- platform: snmp
  name: Dell 2130CN Black
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.4
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'
  value_template: "{{ ((value | float / 2500)*100) | int }}"

- platform: snmp
  name: Dell 2130CN Status
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.17.6.1.5.1.1
  accept_errors: true
  scan_interval: 1000

- platform: snmp
  name: Dell 2130CN Status 2
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.17.6.1.5.1.2
  accept_errors: true
  scan_interval: 1000

- platform: snmp
  name: Dell 2130CN Ready
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.16.5.1.2.1.1
  accept_errors: true
  scan_interval: 1000

- platform: snmp
  name: Dell 2130CN Name
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.25.3.2.1.3.1
  accept_errors: true
  scan_interval: 86400
  value_template: "{{ value.split(';')[0] }}"

- platform: snmp
  name: Dell 2130CN How Long Up
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.1.3.0
  accept_errors: true
  scan_interval: 3600
  unit_of_measurement: 'Days'
  value_template: "{{ (value | int / 8640000) | int }}"

- platform: snmp
  name: Dell 2130CN Pages Printed
  host: 192.168.0.4
  baseoid: 1.3.6.1.2.1.43.10.2.1.4.1.1
  accept_errors: true
  scan_interval: 3600

  #.1.3.6.1.2.1.1.6.0 Location
  #.1.3.6.1.2.1.1.4.0 Contact
  #.1.3.6.1.2.1.43.11.1.1.8.1.4 Black maximum level
  #.1.3.6.1.2.1.43.11.1.1.8.1.1 Cyan Maximum level
  #.1.3.6.1.2.1.43.11.1.1.8.1.2 Magenta Maximum level
  #.1.3.6.1.2.1.43.11.1.1.8.1.3 Yellow Maximum Level

sensor_canonmf8580cdw.yaml :
- platform: snmp
  name: Canon MF850CDW Black
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.1
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'

- platform: snmp
  name: Canon MF8580CDW Cyan
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.2
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'

- platform: snmp
  name: Canon MF8580CDW Magenta
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.3
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'

- platform: snmp
  name: Canon MF8580CDW Yellow
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.43.11.1.1.9.1.4
  accept_errors: true
  scan_interval: 21600
  unit_of_measurement: '%'

- platform: snmp
  name: Canon MF8580CDW Status
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.25.3.5.1.2.1
  accept_errors: true
  scan_interval: 1000
  value_template: >-
    {% if value|int(base=16) is equalto 0 %}
      OK
    {% else %}
      Failed
    {% endif %}

- platform: snmp
  name: Canon MF8580CDW Name
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.25.3.2.1.3.1
  accept_errors: true
  scan_interval: 86400

- platform: snmp
  name: Canon MF8580CDW How Long Up
  host: 192.168.0.6
  baseoid: 1.3.6.1.2.1.1.3.0
  accept_errors: true
  scan_interval: 3600
  unit_of_measurement: 'Days'
  value_template: "{{ ((value | float / 8640000) | float) | round(1) }}"

- platform: snmp
  name: Canon MF8580CDW Pages Printed
  host: 192.168.0.6
  baseoid: 1.3.6.1.4.1.1602.1.11.2.1.1.3.1
  accept_errors: true
  scan_interval: 3600

- platform: snmp
  name: Canon MF8580CDW B/W Pages Printed
  host: 192.168.0.6
  baseoid: 1.3.6.1.4.1.1602.1.11.2.1.1.3.3
  accept_errors: true
  scan_interval: 3600

- platform: snmp
  name: Canon MF8580CDW Color Pages Printed
  host: 192.168.0.6
  baseoid: 1.3.6.1.4.1.1602.1.11.2.1.1.3.5
  accept_errors: true
  scan_interval: 3600

I also have a couple of other YAML files with SNMP calls to APC units, but those weren't causing errors.
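The arithmetic in the `value_template` entries above can be checked outside Home Assistant. The sketch below mirrors the toner and uptime templates in plain Python. The function names are made up for illustration, and it assumes, as the hard-coded divisor suggests, that 2500 is the maximum level the Dell printer reports. The uptime divisor follows from `sysUpTime` (1.3.6.1.2.1.1.3.0) counting hundredths of a second.

```python
# Hundredths of a second in one day: 100 * 60 * 60 * 24.
TICKS_PER_DAY = 8_640_000


def toner_percent(level: float, max_capacity: float = 2500) -> int:
    """Mirror of "{{ ((value | float / 2500)*100) | int }}".

    max_capacity defaults to the divisor hard-coded in the config (an assumption
    about this printer, not a value read from the device).
    """
    return int(level / max_capacity * 100)


def uptime_days(timeticks: int) -> float:
    """Mirror of "{{ ((value | float / 8640000) | float) | round(1) }}"."""
    return round(timeticks / TICKS_PER_DAY, 1)


print(toner_percent(1250))      # 50
print(uptime_days(12_960_000))  # 1.5
```

If the printer's real maximum differs from 2500, the percentage will be off by the same ratio; the commented-out "maximum level" OIDs in the Dell file could be polled to confirm it.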

Traceback/Error logs

In my logs there are the usual "You are using a custom integration" warnings and a reference to an unfinished session, but that's about it.

If I have only one of the two YAML files above enabled, I get:

2020-01-25 10:47:03 WARNING (MainThread) [homeassistant.components.weather] Setup of weather platform met is taking over 10 seconds.

If I activate both, I get this in the logs:

2020-01-26 07:35:30 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for hacs which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you do experience issues with Home Assistant.
2020-01-26 07:35:30 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for alexa_media which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you do experience issues with Home Assistant.
2020-01-26 07:35:31 WARNING (Recorder) [homeassistant.components.recorder] Ended unfinished session (id=411 from 2020-01-26 03:27:51)
2020-01-26 07:35:37 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for xfinity which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you do experience issues with Home Assistant.
2020-01-26 07:35:46 WARNING (MainThread) [homeassistant.setup] Setup of group is taking over 10 seconds.
2020-01-26 07:35:46 WARNING (MainThread) [homeassistant.setup] Setup of input_select is taking over 10 seconds.
2020-01-26 07:35:46 WARNING (MainThread) [homeassistant.setup] Setup of person is taking over 10 seconds.
2020-01-26 07:35:46 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for composite which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you do experience issues with Home Assistant.
2020-01-26 07:35:47 WARNING (MainThread) [homeassistant.components.weather] Setup of weather platform met is taking over 10 seconds.
2020-01-26 07:35:47 ERROR (MainThread) [homeassistant.components.unifi] Error connecting to the UniFi controller at 192.168.0.1
2020-01-26 07:35:47 WARNING (MainThread) [homeassistant.config_entries] Config entry for unifi not ready yet. Retrying in 5 seconds.
2020-01-26 07:35:47 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform snmp is taking over 10 seconds.
2020-01-26 07:35:47 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform snmp is taking over 10 seconds.
2020-01-26 07:35:47 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform snmp is taking over 10 seconds.
2020-01-26 07:35:48 ERROR (MainThread) [metno] https://aa015h6buqvih86i1.api.met.no/weatherapi/locationforecast/1.9/ returned
2020-01-26 07:35:48 ERROR (MainThread) [homeassistant.components.met.weather] Retrying in 17 minutes
2020-01-26 07:36:02 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform alexa_media is taking over 10 seconds.
2020-01-26 07:36:02 WARNING (SyncWorker_6) [custom_components.composite.device_tracker] device_tracker.steve_s_iphone_unifi unsupported source_type: None
2020-01-26 07:36:02 WARNING (SyncWorker_6) [custom_components.composite.device_tracker] device_tracker.barb_s_iphone_unifi unsupported source_type: None

After this, my weather isn't showing and there are various other issues. If I shut down and reload my UniFi controller, the logs then show things being added, so it seems like an initial load issue.

Additional information

These SNMP sensors are something new I added after upgrading to 0.104, so I'm not sure whether the problem is related to that release; I have no earlier state to compare against.

elupus commented 4 years ago

The SNMP implementation does seem to run a separate SNMP client for each sensor you add, so it's likely very inefficient. It should probably keep a global SnmpEngine at least.
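As a rough illustration of the "global engine" idea (not Home Assistant's actual code, and not the pysnmp API), the sketch below creates one engine object on first use and hands the same object to every sensor. `SnmpEngineStub`, `get_shared_engine`, and `SnmpSensorSketch` are hypothetical names; the stub only counts how many engines get built.

```python
from functools import lru_cache


class SnmpEngineStub:
    """Stand-in for an expensive-to-create SNMP engine (hypothetical)."""

    instances_created = 0

    def __init__(self) -> None:
        SnmpEngineStub.instances_created += 1


@lru_cache(maxsize=1)
def get_shared_engine() -> SnmpEngineStub:
    """Build the engine on the first call; later calls return the same object."""
    return SnmpEngineStub()


class SnmpSensorSketch:
    """Hypothetical sensor that borrows the shared engine instead of owning one."""

    def __init__(self, name: str, host: str, baseoid: str) -> None:
        self.name = name
        self.host = host
        self.baseoid = baseoid
        self.engine = get_shared_engine()


# Three sensors taken from the configuration in this issue.
sensors = [
    SnmpSensorSketch("Dell 2130CN Cyan", "192.168.0.4", "1.3.6.1.2.1.43.11.1.1.9.1.1"),
    SnmpSensorSketch("Dell 2130CN Magenta", "192.168.0.4", "1.3.6.1.2.1.43.11.1.1.9.1.2"),
    SnmpSensorSketch("Canon MF8580CDW Cyan", "192.168.0.6", "1.3.6.1.2.1.43.11.1.1.9.1.2"),
]

print(SnmpEngineStub.instances_created)  # 1
```

With one engine per sensor, the counter would equal the number of sensors (20 across the two files above); sharing keeps it at one regardless of how many sensors are configured.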

stale[bot] commented 4 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

tomasz-soltysik commented 2 years ago

I have an issue that sounds similar, but after investigation it looks like an issue with snmpwalk. When I query OID 1.3.6.1.2.1.43.11.1.1.8.0, I get four results, with OIDs ending in 1, 2, 3, and 4. If I append 1, 2, or 3 to the OID (e.g. 1.3.6.1.2.1.43.11.1.1.8.0.3), it returns a value. If I append 4 (1.3.6.1.2.1.43.11.1.1.8.0.4), nothing is returned. I see a similar issue when an OID has only one child: if I query the parent, I get a result; if I query the exact OID, I get no result. It looks like some indexing issue in snmpwalk. In one case, xyz.1 was listed when I queried xyz; querying xyz.1 returned nothing, but xyz.0 returned the value that had been listed for xyz.1.

I couldn't find any resource on the Internet about this issue. Could it be an HP-specific bug in the SNMP implementation?
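For context on parent versus exact-OID queries: a walk is built from repeated GETNEXT requests, each returning the first OID that sorts after the one requested, while a GET answers only for an exact OID. The sketch below models just those two standard operations against an in-memory table. `AgentSketch` is a hypothetical stand-in, the values are made up, and it does not reproduce or explain the device behaviour described above.

```python
from bisect import bisect_right


def parse_oid(text: str) -> tuple:
    """Turn '1.3.6.1' (with or without a leading dot) into (1, 3, 6, 1)."""
    return tuple(int(part) for part in text.strip(".").split("."))


class AgentSketch:
    """In-memory stand-in for an SNMP agent; not a real SNMP library."""

    def __init__(self, values: dict) -> None:
        self._table = {parse_oid(oid): value for oid, value in values.items()}
        self._order = sorted(self._table)  # tuple order matches OID order

    def get(self, oid: str):
        """GET: answer only if this exact OID holds a value."""
        return self._table.get(parse_oid(oid))

    def get_next(self, oid: tuple):
        """GETNEXT: first stored OID strictly after the requested one."""
        index = bisect_right(self._order, oid)
        return self._order[index] if index < len(self._order) else None

    def walk(self, root: str) -> list:
        """Walk: repeat GETNEXT until the result leaves the root's subtree."""
        prefix = parse_oid(root)
        current, found = prefix, []
        while True:
            current = self.get_next(current)
            if current is None or current[: len(prefix)] != prefix:
                return found
            found.append((".".join(map(str, current)), self._table[current]))


# Made-up values, for illustration only.
agent = AgentSketch({
    "1.3.6.1.2.1.43.11.1.1.8.1.1": "2500",
    "1.3.6.1.2.1.43.11.1.1.8.1.2": "2500",
    "1.3.6.1.2.1.1.3.0": "12960000",
})

print(agent.walk("1.3.6.1.2.1.43.11.1.1.8"))     # both .8.1.x entries
print(agent.get("1.3.6.1.2.1.43.11.1.1.8"))      # None: the parent holds no value
print(agent.get("1.3.6.1.2.1.43.11.1.1.8.1.2"))  # 2500
```

In this model a listed child always answers an exact GET, so a device where a walked OID returns nothing on a direct query is deviating from it in some way.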

quadhammer commented 2 years ago

I believe I'm having this problem too. It seems very random as to which sensors get loaded, but there are always some that don't. Rebooting sometimes helps increase the number but sensors are still missing.

I have 14 sensors: 2 UPSes with 7 sensors each.