Closed SuvarnaMeenakshi closed 2 years ago
@abdosi fyi
seen on other platforms also
Get object list with multi-asic VS:
{"list": 550, "dict": 3186, "tuple": 2608, "coroutine": 46, "frame": 45, "Future": 22, "TimerHandle": 21, "Context": 92, "FutureIter": 22, "TaskWakeupMethWrapper": 22, "scputimes": 12, "svmem": 1, "wrapper_descriptor": 1184, "builtin_function_or_method": 1588, "method_descriptor": 947, "classmethod_descriptor": 27, "getset_descriptor": 925, "member_descriptor": 382, "weakref": 1435, "staticmethod": 180, "module": 276, "SimpleNamespace": 1, "function": 6527, "type": 831, "ModuleSpec": 273, "property": 547, "classmethod": 193, "cell": 529, "frozenset": 101, "FileFinder": 31, "set": 169, "SourceFileLoader": 212, "CodecInfo": 2, "ABCMeta": 78, "TextIOWrapper": 3, "IncrementalDecoder": 1, "IncrementalEncoder": 2, "BufferedReader": 1, "FileIO": 3, "BufferedWriter": 2, "_Environ": 2, "Quitter": 2, "_Printer": 3, "_Helper": 1, "Repr": 1, "method": 163, "itemgetter": 37, "mappingproxy": 1, "WeakKeyDictionary": 2, "count": 3, "EnumMeta": 54, "DynamicClassAttribute": 2, "OrderedDict": 60, "_NamedIntConstant": 74, "RegexFlag": 15, "_lru_cache_wrapper": 7, "JSONEncoder": 1, "JSONDecoder": 1, "Scanner": 1, "ExtensionFileLoader": 17, "_TemplateMetaclass": 1, "method-wrapper": 2, "WeakSet": 4, "_MainThread": 1, "Event": 6, "Formatter": 2, "WeakValueDictionary": 2, "_StderrHandler": 1, "Condition": 2, "PercentStyle": 2, "deque": 8, "RootLogger": 1, "Manager": 1, "AddressFamily": 32, "SocketKind": 7, "MsgFlag": 14, "AddressInfo": 7, "Logger": 5, "NullHandler": 1, "Signals": 32, "Handlers": 2, "Sigmasks": 3, "_Feature": 10, "_SSLMethod": 6, "Options": 12, "AlertDescription": 27, "SSLErrorNumber": 9, "VerifyFlags": 5, "VerifyMode": 3, "TLSVersion": 7, "Purpose": 2, "_ASN1Object": 2, "StrictVersion": 1, "UUID": 4, "SafeUUID": 3, "PlaceHolder": 1, "_SendfileMode": 3, "_ParameterKind": 5, "_RunningLoop": 1, "Random": 1, "IPv4Address": 31, "IPv4Network": 19, "IPv6Network": 28, "ValueType": 13, "PduTypes": 19, "IPv6Address": 43, "RegisteredPDU": 13, "Reason": 6, "Errors": 32, "_WindowsFlavour": 1, "_PosixFlavour": 1, "_NormalAccessor": 1, "attrgetter": 2, "_NSEnum": 1, "methodcaller": 1, "partial": 7, "YAMLObjectMetaclass": 1, "SysLogHandler": 1, "socket": 4, "Load": 1, "Store": 1, "Del": 1, "AugLoad": 1, "AugStore": 1, "Param": 1, "And": 1, "Or": 1, "Add": 1, "Sub": 1, "Mult": 1, "MatMult": 1, "Div": 1, "Mod": 1, "Pow": 1, "LShift": 1, "RShift": 1, "BitOr": 1, "BitXor": 1, "BitAnd": 1, "FloorDiv": 1, "Invert": 1, "Not": 1, "UAdd": 1, "USub": 1, "Eq": 1, "NotEq": 1, "Lt": 1, "LtE": 1, "Gt": 1, "GtE": 1, "Is": 1, "IsNot": 1, "In": 1, "NotIn": 1, "_UnixDefaultEventLoopPolicy": 1, "hamt": 1, "hamt_bitmap_node": 1, "MIBMeta": 20, "_Local": 1, "LLDPRemoteTables": 12, "LLDPLocalChassis": 6, "LLDPLocalSystemDataUpdater": 1, "MIBEntry": 12, "LLDPRemTableUpdater": 1, "SubtreeMIBEntry": 92, "_UnixSelectorEventLoop": 1, "LocPortUpdater": 1, "LLDPLocManAddrUpdater": 1, "SonicV2Connector": 93, "EpollSelector": 1, "_SelectorMapping": 1, "SelectorKey": 2, "Handle": 4, "DbTables": 24, "IfTypes": 3, "ArpUpdater": 1, "NextHopUpdater": 1, "OverlayAdpaterMIBEntry": 24, "sysNameUpdater": 1, "PhysicalClass": 12, "FanInfoDB": 5, "FanDrawerInfoDB": 4, "PhysicalRelationInfoDB": 4, "PsuInfoDB": 8, "XcvrInfoDB": 6, "ThermalInfoDB": 2, "OidMIBEntry": 24, "PhysicalTableMIBUpdater": 1, "DbTables32": 4, "DbTables64": 8, "InterfaceMIBUpdater": 1, "RedisOidTreeUpdater": 2, "ThermalCacheUpdater": 1, "FanCacheUpdater": 1, "FanDrawerCacheUpdater": 1, "PsuCacheUpdater": 1, "XcvrCacheUpdater": 1, "EntitySensorDataScale": 17, "EntitySensorStatus": 3, "EntitySensorValueRange": 2, "PhysicalSensorTableMIBUpdater": 1, "RouteUpdater": 1, "FdbUpdater": 1, "NicDuplex": 3, "BatteryTime": 2, "_WrapNumbers": 1, "IOPriority": 4, "LLDPRemManAddrUpdater": 1, "InterfacesUpdater": 1, "EntitySensorDataType": 12, "Negsignal": 32, "Connections": 1, "SystemUtilizationHandler": 1, "BgpSessionUpdater": 1, "Agent": 1, "MIBTable": 1, "SocketManager": 1, "Task": 22, "_GatheringFuture": 1, "ThreadPoolExecutor": 1, "SimpleQueue": 1, "Thread": 1, "DirectionTypes": 2, "QueueStatUpdater": 1, "CHASSISInfoDB": 1, "PSUInfoDB": 2, "PowerStatusHandler": 1, "PubSub": 35, "Codec": 2, "ConfigDBConnector": 4, "AgentX": 1, "_SelectorSocketTransport": 1, "KeyedRef": 1, "PfcUpdater": 1, "PfcPrioUpdater": 1}
ANd there are 1000+ ''anon_inode:[eventpoll]' and 600+ 'socket' opened.
For epoll is related with PubSub, for socket it related with DBConnector, however the python object count is normal.
Also following test code also show similer issue: from swsscommon.swsscommon import SonicDBConfig SonicDBConfig.load_sonic_global_db_config() SonicDBConfig.get_ns_list() from swsscommon.swsscommon import SonicV2Connector db = SonicV2Connector(use_unix_socket_path=True, namespace='') db.connect("CONFIG_DB") db.get_redis_client("CONFIG_DB") db.connect("CONFIG_DB") db.get_redis_client("CONFIG_DB") client = db.get_redis_client("CONFIG_DB") client.pubsub() <== run multiple times the epool keeps increase but PubSub object destoryed
So, consider about the C++ object not destoryed by SWIG generated code. will add log to confirm this.
This issue is because some method which return new object need declare as %newobject in SWIG: https://www.swig.org/Doc4.0/SWIGDocumentation.html#Customization_ownership
Verifyed on local devbox, will submit PR and validate if the fix can work with chassis
The issue can be reproduce with following code: from swsscommon.swsscommon import SonicDBConfig from swsscommon.swsscommon import SonicV2Connector import gc
SonicDBConfig.load_sonic_global_db_config() SonicDBConfig.get_ns_list() db = SonicV2Connector(use_unix_socket_path=True, namespace='') db.connect("CONFIG_DB") db.get_redis_client("CONFIG_DB") client = db.get_redis_client("CONFIG_DB") client.pubsub() client.pubsub() client.pubsub() client.newConnector(0) client.newConnector(0) client.newConnector(0)
gc.collect()
And there will be 3 epoll and 4 socket opened in python. After fix this issue, ther will only 1 socket opened.
Found another memory leak issue in swsscommon:
void ConfigDBConnector_Native::db_connect(string db_name, bool wait_for_init, bool retry_on) { m_db_name = db_name; m_key_separator = m_table_name_separator = get_db_separator(db_name); SonicV2Connector_Native::connect(m_db_name, retry_on);
if (wait_for_init)
{
auto& client = get_redis_client(m_db_name);
auto pubsub = client.pubsub(); <== this is a pointer but not delete.
https://github.com/Azure/sonic-swss-common/pull/655 checked with the sonic-buildimage multi-asic VS image with your latest swss-common PR, I see that the fd count is not going above this range; the issue that we were seeing in SNMP is fixed with the fix
root@vlab-08:/# ls -l /proc/24/fd | wc -l 650 root@vlab-08:/# ls -l /proc/24/fd | wc -l 650 root@vlab-08:/# ls -l /proc/24/fd | wc -l 626 root@vlab-08:/# ls -l /proc/24/fd | wc -l 626
Description
Steps to reproduce the issue:
Describe the results you received:
Error logs seen
Describe the results you expected:
Should not see any errors in syslog
Output of
show version
:With master branch.
Output of
show techsupport
:Additional information you deem important (e.g. issue happens only occasionally):