sonic-net / SONiC

Landing page for Software for Open Networking in the Cloud (SONiC) - https://sonic-net.github.io/SONiC/

SNMP can't get COUNTERS_DB in startup #446

Open jerry-chang3300 opened 5 years ago

jerry-chang3300 commented 5 years ago

After startup, we get the messages below:

Nov  4 03:03:05.454320 as7326-56x ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):#012  File "/usr/local/lib/python3.6/dist-packages/ax_interface/mib.py", line 46, in start#012    self.update_data()#012  File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/ietf/rfc2863.py", line 98, in update_data#012    for sai_id in self.if_id_map}#012  File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/ietf/rfc2863.py", line 98, in <dictcomp>#012    for sai_id in self.if_id_map}#012  File "/usr/local/lib/python3.6/dist-packages/swsssdk/interface.py", line 38, in wrapped#012    ret_data = f(inst, db_name, *args, **kwargs)#012  File "/usr/local/lib/python3.6/dist-packages/swsssdk/interface.py", line 324, in get_all#012    raise UnavailableDataError(message, _hash)#012swsssdk.exceptions.UnavailableDataError: Key 'b'COUNTERS:oid:0x1000000000002'' unavailable in database 'COUNTERS_DB'

Nov  4 03:04:05.789213 as7326-56x ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):#012  File "/usr/local/lib/python3.6/dist-packages/ax_interface/mib.py", line 40, in start#012    self.reinit_data()#012  File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/vendor/cisco/ciscoPfcExtMIB.py", line 40, in reinit_data#012    self.update_data()#012  File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/vendor/cisco/ciscoPfcExtMIB.py", line 49, in update_data#012    for sai_id in self.if_id_map}#012  File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/vendor/cisco/ciscoPfcExtMIB.py", line 49, in <dictcomp>#012    for sai_id in self.if_id_map}#012  File "/usr/local/lib/python3.6/dist-packages/swsssdk/interface.py", line 38, in wrapped#012    ret_data = f(inst, db_name, *args, **kwargs)#012  File "/usr/local/lib/python3.6/dist-packages/swsssdk/interface.py", line 324, in get_all#012    raise UnavailableDataError(message, _hash)#012swsssdk.exceptions.UnavailableDataError: Key 'b'COUNTERS:oid:0x1000000000002'' unavailable in database 'COUNTERS_DB'

It seems that the SNMP subagent tries to read a key from COUNTERS_DB that is unavailable. Does anyone have an idea how to fix this issue?
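For illustration only, one way an updater could tolerate a missing key is to skip the interfaces whose counters are not in the database, rather than let one exception abort the whole update. The sketch below is self-contained and does not call the real swsssdk API: `UnavailableDataError`, `FakeCountersDb`, and `collect_counters` are hypothetical stand-ins, and the key format is copied from the log above.

```python
class UnavailableDataError(Exception):
    """Hypothetical stand-in for swsssdk.exceptions.UnavailableDataError."""


class FakeCountersDb:
    """Hypothetical in-memory stand-in for COUNTERS_DB access."""

    def __init__(self, tables):
        self._tables = tables

    def get_all(self, key):
        # Mimic the behaviour seen in the traceback: a missing key raises.
        if key not in self._tables:
            raise UnavailableDataError(
                "Key '%s' unavailable in database 'COUNTERS_DB'" % key
            )
        return self._tables[key]


def collect_counters(db, sai_ids):
    """Return (counters, missing): counters per SAI id, plus ids with no entry."""
    counters = {}
    missing = []
    for sai_id in sai_ids:
        try:
            counters[sai_id] = db.get_all("COUNTERS:oid:0x" + sai_id)
        except UnavailableDataError:
            # Remember the id instead of aborting the whole update.
            missing.append(sai_id)
    return counters, missing


db = FakeCountersDb(
    {"COUNTERS:oid:0x1000000000001": {"SAI_PORT_STAT_IF_IN_OCTETS": "10"}}
)
counters, missing = collect_counters(db, ["1000000000001", "1000000000002"])
```

Whether skipping is acceptable depends on how the MIB should report an interface with no counters; this only shows the control flow.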

raphaelt-nvidia commented 3 years ago

Having just joined the group, I don't know how to fix it, but I can provide a way, possibly one of many, to reproduce it. Change the topology, e.g. by running this in sonic-mgmt docker:

./testbed-cli.sh remove-topo -t0 vault
./testbed-cli.sh remove-topo -t1 vault
./testbed-cli.sh remove-topo -t1-lag vault
./testbed-cli.sh remove-topo -ptf32 vault
./testbed-cli.sh add-topo -t0 vault
./testbed-cli.sh gen-mg -t0 lab vault
./testbed-cli.sh deploy-mg -t0 lab vault

Then run config reload -y on the DUT. The errors appear in the log after the reload. Speculation: the contents of the COUNTERS_DB name-mapping tables change when the topology changes. Is that the cause of the errors, or is a DB lookup occurring before the DB is ready, so that the real problem is only that the change makes us access the DB too early?
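If the second explanation holds (the lookup runs before the DB is populated), a bounded retry is one possible mitigation. This is a minimal sketch under that assumption, not the subagent's actual behaviour: `wait_for_keys` is a hypothetical helper, and `lookup` is any callable that returns None for a key that is not present yet.

```python
import time


def wait_for_keys(lookup, keys, attempts=5, delay=0.1, sleep=time.sleep):
    """Poll until every key is present or the attempts run out.

    Returns the keys that are still missing (an empty list means ready).
    """
    missing = list(keys)
    for attempt in range(attempts):
        # Keep only the keys the lookup still cannot find.
        missing = [key for key in missing if lookup(key) is None]
        if not missing:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return missing
```

A caller could run this once after a reload and start the counter updates only when the returned list is empty. If the name maps themselves are stale after the topology change, retrying would not help and the maps would need to be rebuilt instead.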