lextudio / pysnmp

Python SNMP library
https://www.pysnmp.com/pysnmp/
BSD 2-Clause "Simplified" License

[Bug]: bulkWalkCmd introduces loop when host unreachable #66

Closed GameMaster47 closed 2 months ago

GameMaster47 commented 5 months ago

Expected behavior

When trying to iterate over the generator returned by bulkWalkCmd, we'd expect to see an exception raised in case the target host is unreachable or not responding to the getBulk request.

Actual behavior

Instead, we see a loop occurring inside the generator, where a getBulk message is sent out repeatedly.

Detailed steps

We first found this issue in our application code base, but it can be replicated with the sample code provided in https://docs.lextudio.com/pysnmp/docs/hlapi/asyncio/manager/cmdgen/bulkwalkcmd, since the demo.pysnmp.com host is unreachable.

Python package information

6.1.2

Operating system information

Ubuntu 22.04.4 LTS

Python information

3.10.12

(Optional) Contents of your test script

from pysnmp.hlapi import *
from pysnmp import debug

# use specific flags or 'all' for full debugging
debug.setLogger(debug.Debug('all'))
g = bulkWalkCmd(SnmpEngine(),
   CommunityData('public'),
   UdpTransportTarget(('demo.pysnmp.com', 161)),
   ContextData(),
   0, 25,
   ObjectType(ObjectIdentity('SNMPv2-MIB', 'sysDescr')))
next(g)

Relevant log output

2024-04-30 12:24:27,460 pysnmp: StatusInformation: {'errorIndication': RequestTimedOut('No SNMP response received before timeout')}
2024-04-30 12:24:27,460 pysnmp: processResponsePdu: sendPduHandle 4922693, statusInformation {'errorIndication': RequestTimedOut('No SNMP response received before timeout')}
2024-04-30 12:24:27,461 pysnmp: sendPdu: securityName s2108015491253068459, PDU
GetBulkRequestPDU:
 request-id=16078184
 non-repeaters=0
 max-repetitions=25
 variable-bindings=VarBindList:
  VarBind:
   name=1.3.6.1.2.1.1.1
   =_BindValue:
    unSpecified=

2024-04-30 12:24:27,461 pysnmp: sendPdu: current time 20 ticks, one tick is 0.5 seconds
2024-04-30 12:24:27,461 pysnmp: sendPdu: new sendPduHandle 4922694, timeout 2.0 ticks, cbFun <bound method CommandGenerator.processResponsePdu of <pysnmp.entity.rfc3413.cmdgen.BulkCommandGenerator object at 0x7609e98268c0>>
2024-04-30 12:24:27,461 pysnmp: prepareOutgoingMessage: PDU request-id 16078184 replaced with unique ID 16169135
2024-04-30 12:24:27,461 pysnmp: prepareOutgoingMessage: using contextEngineId <SnmpEngineID value object, tagSet <TagSet object, tags 0:0:4>, subtypeSpec <ConstraintsIntersection object, consts <ValueSizeConstraint object, consts 0, 65535>, <ValueSizeConstraint object, consts 5, 32>>, encoding iso-8859-1, payload [0x80004fb8056465...732d766d38998ec0]> contextName b''
2024-04-30 12:24:27,461 pysnmp: generateRequestMsg: using community <OctetString value object, tagSet <TagSet object, tags 0:0:4>, subtypeSpec <ConstraintsIntersection object, consts <ValueSizeConstraint object, consts 0, 65535>>, encoding iso-8859-1, payload [public]> for securityModel <SnmpSecurityModel value object, tagSet <TagSet object, tags 0:0:2>, subtypeSpec <ConstraintsIntersection object, consts <ValueRangeConstraint object, consts -2147483648, 2147483647>, <ValueRangeConstraint object, consts 0, 2147483647>, <ValueRangeConstraint object, consts 1, 2147483647>>, payload [2]>, securityName <SnmpAdminString value object, tagSet <TagSet object, tags 0:0:4>, subtypeSpec <ConstraintsIntersection object, consts <ValueSizeConstraint object, consts 0, 65535>, <ValueSizeConstraint object, consts 0, 255>>, encoding utf-8, payload [s2108015491253068459]>, contextEngineId <SnmpEngineID value object, tagSet <TagSet object, tags 0:0:4>, subtypeSpec <ConstraintsIntersection object, consts <ValueSizeConstraint object, consts 0, 65535>, <ValueSizeConstraint object, consts 5, 32>>, encoding iso-8859-1, payload [0x80004fb8056465...732d766d38998ec0]> contextName b''
2024-04-30 12:24:27,462 pysnmp: generateRequestMsg: Message:
 version=1
 community=public
 data=PDUs:
  get-bulk-request=GetBulkRequestPDU:
   request-id=16169135
   non-repeaters=0
   max-repetitions=25
   variable-bindings=VarBindList:
    VarBind:
     name=1.3.6.1.2.1.1.1
     =_BindValue:
      unSpecified=

2024-04-30 12:24:27,462 pysnmp: sendPdu: MP succeeded
2024-04-30 12:24:27,462 pysnmp: sendMessage: sending transportAddress ('20.163.207.223', 161) outgoingMessage 
00000: 30 28 02 01 01 04 06 70 75 62 6C 69 63 A5 1B 02 
00016: 04 00 F6 B8 AF 02 01 00 02 01 19 30 0D 30 0B 06 
00032: 07 2B 06 01 02 01 01 01 05 00
^CTraceback (most recent call last):
  File "/home/rmarques/devops/lldp-parser/app/from pysnmp.py", line 12, in <module>
    next(g)
  File "/home/rmarques/.local/share/virtualenvs/lldp-parser-63Lywrb8/lib/python3.10/site-packages/pysnmp/hlapi/asyncio/sync/cmdgen.py", line 679, in bulkWalkCmd
    result = loop.run_until_complete(run())
  File "/usr/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 1871, in _run_once
    event_list = self._selector.select(timeout)
  File "/usr/lib/python3.10/selectors.py", line 469, in select
    fd_event_list = self._selector.poll(timeout, max_ev)
KeyboardInterrupt

GameMaster47 commented 5 months ago

After some debugging of the pysnmp code, I could see that the issue is inside ./hlapi/asyncio/cmdgen.py. It seems that nothing really handles the requestTimedOut error there:

lines 958 - 966

        if errorIndication:
            yield (
                errorIndication,
                errorStatus,
                errorIndex,
                varBindTable and varBindTable[0] or [],
            )
            if errorIndication != errind.requestTimedOut:
                return

After this, nothing in the code catches it, and since this runs inside a run_until_complete loop, I guess that is what causes the loop.

I've changed this to:

lines 958 - 966

        if errorIndication:
            yield (
                errorIndication,
                errorStatus,
                errorIndex,
                varBindTable and varBindTable[0] or [],
            )
            #if errorIndication != errind.requestTimedOut:
            return

I did this just to test it out, and the loop stops, but I'm not sure what impact this might have on the rest of the code.
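
The control flow described above can be modelled without pysnmp. What follows is a minimal, self-contained sketch and not the library's actual code: `walk`, `unreachable_host`, `REQUEST_TIMED_OUT`, the `stop_on_timeout` flag and the `max_sends` cap are all hypothetical names introduced for illustration. It assumes that a consumer which keeps driving the generator after a timeout is yielded causes another request to be sent, which appears to match the repeated sendPdu entries in the log.

```python
from typing import Callable, Iterator, Optional, Tuple

# Stand-in for errind.requestTimedOut (assumption: any distinct value works here).
REQUEST_TIMED_OUT = "requestTimedOut"


def walk(
    send: Callable[[], Optional[str]],
    stop_on_timeout: bool,
    max_sends: int = 5,
) -> Iterator[Tuple[Optional[str], int]]:
    """Toy model of the walk loop quoted in this issue.

    send() returns an error indication, or None on success.
    max_sends is a safety cap for this demo only; the loop
    discussed in the issue has no such cap.
    """
    sends = 0
    while sends < max_sends:
        error_indication = send()
        sends += 1
        if error_indication:
            yield error_indication, sends
            # stop_on_timeout=False mirrors the quoted condition:
            # only non-timeout errors end the walk.
            if stop_on_timeout or error_indication != REQUEST_TIMED_OUT:
                return
            continue  # timeout: fall through and send again
        yield None, sends
        return  # toy model: one successful response ends the walk


def unreachable_host() -> str:
    """Simulates a target that never answers."""
    return REQUEST_TIMED_OUT


original = list(walk(unreachable_host, stop_on_timeout=False))
patched = list(walk(unreachable_host, stop_on_timeout=True))

print(len(original))  # 5: one request per iteration until the demo cap
print(len(patched))   # 1: the first timeout ends the walk
```

In this model, the unconditional `return` turns a timeout into a terminal condition, so the caller sees the error once and the walk ends. Whether retrying after a timeout was intentional in the original code is not something this sketch can answer.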

lextm commented 2 months ago

Since we retired the sync API in release 6.2, this issue will not be worked on.