exhuma / puresnmp

Pure Python SNMPv2 Library
MIT License
79 stars 22 forks source link

bulkwalk loops forever if a buggy snmp server loops #94

Open scotloach opened 3 years ago

scotloach commented 3 years ago

Issue Description

I have a device that is probably in some bad state. Doing a bulkwalk on an OID returns the expected varbinds, but then loops back to the second result and continues looping like this forever.

puresnmp only returns the results once, but then it never returns from bulkwalk() since it loops forever in the unfinished_oids loop.

I think what is needed is a check in multiwalk(): if yielded hasn't had anything added to it by the current multiget, then exit the unfinished_oids loop.

exhuma commented 3 years ago

Thanks for reporting this. It shouldn't be too complicated to add a simple loop detection for this case.

exhuma commented 3 years ago

Do you still have the device available? I might need your help with this to reliably reproduce this.

scotloach commented 3 years ago

Yes, I still have it available. Just let me know what I can do to help.

exhuma commented 3 years ago

If you could trigger the error with debug-logging enabled and send the output that would be a huge help.

Something like this:

import puresnmp
import logging
logging.basicConfig(level=logging.DEBUG)

... code-that-triggers-the-loop

Considering that this is an endless loop you will need to CTRL-C this.

This will print out hexdumps of all the packet exchanges between client and server. These hexdumps are what interests me primarily. They will help me dig deeper into the issue and reproduce it in a test-environment.

firstoxe commented 2 years ago

fast hot fix xD

/api/raw.py

    error_oid_loop = 300  
    for requested, retrieved in zip(oids, output):
        if not OID(requested) < retrieved.oid:
            try:
                if error_oid_loop > 0:
                    stringified = unicode(retrieved.oid)  # TODO remove when Py2 is dropped
            except:
                error_oid_loop -= 1
    return output
exhuma commented 2 years ago

The new major release 2.0 has finally hit pypi. Albeit as pre-release for now so you have to opt-in by either running pip install with the --pre flag or using the exact-version with pip install "puresnmp==2.0.0rc1".

If you have the possibility to upgrade, could you give it a try and see how the device behaves? With luck I might be able to iron this out before flagging the release as "final".

Note that this is a fairly big change due to the focus on asyncio. Depending on your project this may be a painful upgrade but I would appreciate the feedback.

Docs (including overview of the upgrade) are available at https://puresnmp.readthedocs.io/en/develop/

black-punkduck commented 11 months ago

Repeating or non-increasing OIDs

In my case this error occurs for the standard IPv6 Mib on several Cisco boxes. This MIB is used like this: if you know an address, you can figure out the snmp-index of the port, which then could be used to decide if one manages the interface.

The error itself is unfortunately not uncommon and I know that problem since years from different MIBs. And even system snmpwalk itself had the problem as well in the beginning. One can still switch the check off (option -Cc)

Since routers and switches have configured ports, they should not be switched off or be rebooted (to avoid outages), that means that the OIDs are created at runtime, so there is always a good chance that something is wrong inside the OS of a router. Classically the error disappears after a reboot.

So first the normal way with system command snmpwalk:

Hint: The coding is a bit odd, so I added the version with numbers only as well. I changed the V6 address in the example for data security reasons ;)

/usr/bin/snmpwalk -v2c -c mycommunity 10.10.0.167 1.3.6.1.2.1.4.34.1.3.4
IP-MIB::ipAddressIfIndex.ipv6z."21:22:23:24:25:26:00:5a:7f:cc:00:00:00:00:00:0b%503316481" = INTEGER: 11
IP-MIB::ipAddressIfIndex.ipv6z."21:22:23:24:25:26:00:5a:7f:cc:00:00:00:00:00:0d%503316481" = INTEGER: 15
IP-MIB::ipAddressIfIndex.ipv6z."fd:00:00:00:00:00:00:00:00:00:00:00:00:51:00:00%503316481" = INTEGER: 14
IP-MIB::ipAddressIfIndex.ipv6z."fe:80:00:00:00:00:00:00:02:42:68:ff:fe:55:ca:30%301989901" = INTEGER: 9
IP-MIB::ipAddressIfIndex.ipv6z."21:22:23:24:25:26:00:5a:7f:cc:00:00:00:00:00:0b%503316481" = INTEGER: 11
Error: OID not increasing: IP-MIB::ipAddressIfIndex.ipv6z."fe:80:00:00:00:00:00:00:02:42:68:ff:fe:55:ca:30%301989901"
 >= IP-MIB::ipAddressIfIndex.ipv6z."21:22:23:24:25:26:00:5a:7f:cc:00:00:00:00:00:0b%503316481"

/usr/bin/snmpwalk -v2c -c mycommunity -On 10.10.0.167 1.3.6.1.2.1.4.34.1.3.4
.1.3.6.1.2.1.4.34.1.3.4.20.33.34.35.36.37.38.0.90.127.204.0.0.0.0.0.11.30.0.0.1 = INTEGER: 11
.1.3.6.1.2.1.4.34.1.3.4.20.33.34.35.36.37.38.0.90.127.204.0.0.0.0.0.13.30.0.0.1 = INTEGER: 15
.1.3.6.1.2.1.4.34.1.3.4.20.253.0.0.0.0.0.0.0.0.0.0.0.0.81.0.0.30.0.0.1 = INTEGER: 14
.1.3.6.1.2.1.4.34.1.3.4.20.254.128.0.0.0.0.0.0.2.66.104.255.254.85.202.48.18.0.0.13 = INTEGER: 9
.1.3.6.1.2.1.4.34.1.3.4.20.33.34.35.36.37.38.0.90.127.204.0.0.0.0.0.11.30.0.0.1 = INTEGER: 11
Error: OID not increasing: .1.3.6.1.2.1.4.34.1.3.4.20.254.128.0.0.0.0.0.0.2.66.104.255.254.85.202.48.18.0.0.13
 >= .1.3.6.1.2.1.4.34.1.3.4.20.33.34.35.36.37.38.0.90.127.204.0.0.0.0.0.11.30.0.0.1

puresnmp would run forever but since the oid is not appended it would not trash the memory at least. My proposal is just to check if the oid already exists:

so in bulkget in site-packages/puresnmp/api/raw.py I propose this:

repeating_out = OrderedDict()  # type: Dict[ObjectIdentifier, Type[Any]]
    for oid, value in repeating_tmp:
        if isinstance(value, EndOfMibView):
            break
        if oid in repeating_out:
            raise SnmpError("Error: OID not increasing: %s" % (oid))
        repeating_out[oid] = value

that would act pretty similar to snmpwalk .. although it is not really testing if the OID increases, of course. For increasing numbers I would need a comparison, I am not sure if I am smart enough to do that with python ... (but in C code it is written like that afaik).

And of course I know that the function is called more than one time, but keeping the last value somewhere in a "static variable", which has to be reset before I do a new command? I dunno, I am not that familiar with your code, but at least wanted to help a bit.

I can do tests of course, since classically I always have some boxes doing weird stuff.

Hopefully I do not have to return to encapsuled snmpwalk commands again, I like the concept of puresnmp and it works great in all other cases.