etingof / pysnmp

Python SNMP library
http://snmplabs.com/pysnmp/
BSD 2-Clause "Simplified" License
576 stars 194 forks source link

Memory leak using AsynCommandGenerator in pysnmp 4.3.2 #8

Closed westofpluto closed 7 years ago

westofpluto commented 8 years ago

We are using v4.3.2 and we are seeing the memory steadily increase when using AsynCommandGenerator. The code is like what is shown below. This is a SNMPGet class that we create and run for each set of OIDs we have. If we have a long running process that does this over and over, that process eventually consumes all the machine memory. We are using the memory_profiler library and @profile decorator as described here: https://pypi.python.org/pypi/memory_profiler

Are we using AsynCommandGenerator incorrectly? How can we make the error growth go away in a long running process?

#
# worker class for snmp information
#
import threading
import sys

from pysnmp.entity.rfc3413.oneliner import cmdgen
from pysnmp.proto import rfc1902
from memory_profiler import profile

class SNMPGet(object):
    def __init__(self, device_id, ip_address, object_identifiers, cookie, callback, error_callback):
        self.device_id = device_id
        self.ip_address = ip_address
        self.object_identifiers = object_identifiers
        self.cookie = cookie
        self.callback = callback
        self.error_callback = error_callback
        self.error_count = 0
        self.lock = threading.Lock()
        self.executing = False
        #
        self.cmd_gen=None
        self.targets=None

    def execute(self):
        if not self.executing:
            # Prevent lots of executions from accruing when errors occur
            self.executing = True
            for tmp_object_identifiers in self.object_identifiers:
                self._internal_execute(tmp_object_identifiers)
            self.executing = False

    @profile
    def _internal_execute(self, tmp_object_identifiers):
        try:
            try:
                del self.cmd_gen
            except:
                pass
            try:
                del self.targets
            except:
                pass

            self.cmd_gen = cmdgen.AsynCommandGenerator()

            self.targets = (
                (cmdgen.CommunityData('public', mpModel=0),
                 cmdgen.UdpTransportTarget((self.ip_address, 161), retries=0),
                 tmp_object_identifiers),)

            for auth_data, transport_target, var_names in self.targets:
                res = self.cmd_gen.getCmd(auth_data, transport_target, var_names, (self._snmp_callback, None))
                del res

            self.cmd_gen.snmpEngine.transportDispatcher.runDispatcher()

        except Exception as exc:
            sys.stderr.write('%s\n' % (exc,))
            self._increment_error_count()
            self.error_callback(self, self.cookie)
        finally:
            try:
                self.cmd_gen.uncfgCmdGen()
            except:
                pass
            try:
                del self.cmd_gen
            except:
                pass
            try:
                del self.targets 
            except:
                pass

    def _snmp_callback(self, send_request_handle, error_indication, error_status, error_index, var_binds, cbCtx):
        if error_indication:
            self._increment_error_count()
            self.error_callback(self, self.cookie)
            return

        if error_status:
            self._increment_error_count()
            self.error_callback(self, self.cookie)
            return

        self.callback(self, var_binds, self.cookie)
        self.reset_error_count()

    def _increment_error_count(self):
        self.lock.acquire()
        try:
            self.error_count += 1
        finally:
            self.lock.release()

    def reset_error_count(self):
        self.lock.acquire()
        try:
            self.error_count = 0
        finally:
            self.lock.release()
etingof commented 8 years ago

Have you been able to figure out, with the memory_profiler library, what objects are leaking?

Do you run your code in multiple threads?

So far I do not see obvious issues with your pysnmp code, however I have a few suggestions:

westofpluto commented 7 years ago

Can you please elaborate on this answer? How exactly would I do "SnmpEngine instance unloading to free up resources"? Do you just mean that I should call

del snmpEngine

after I use it? Or is it more involved than that?

Yes we run the code in multiple threads.

etingof commented 7 years ago

Let me revise my original answer then. Assuming you are calling SNMPGet from a thread and you are not doing SNMP I/O asynchronously, my suggestion is to use something like this:

from pysnmp.hlapi import *
from pysnmp.hlapi.lcd import CommandGeneratorLcdConfigurator

class SNMPGet:
    """Instance of this class must be thread-local, not shared between threads."""
    def __init__(self, *):
        self._snmpEngine = SnmpEngine()

    def _internal_execute(self, tmp_object_identifiers):
        gen =  getCmd(self._snmpEngine,
                      CommunityData('public'),
                      UdpTransportTarget(('demo.snmplabs.com', 161)),
                      ContextData(),
                      *[ObjectType(ObjectIdentity(oid)) for oid in tmp_identifiers])

        errorIndication, errorStatus, errorIndex, varBinds = next(gen)

    def cleanup(self):
        """Running SnmpEngine instance against ever increasing set of targets may grow memory.
            To drop SnmpEngine's caches you can periodically call this method. The backside is
            slight drop in performance as hot caches will have to be repopulated.
        """
        CommandGeneratorLcdConfigurator.unconfigure(self._snmpEngine)

As a side note, if you are fetching many OIDs from many targets, async I/O may give you better performance compared to MT one.

etingof commented 7 years ago

If the leak shows up with the later pysnmp versions -- please reopen this issue. Thank you!