nabla-c0d3 / sslyze

Fast and powerful SSL/TLS scanning library.
GNU Affero General Public License v3.0
3.22k stars 445 forks

Memory Leak still seems to be present when running in Python #603

Open lcurtis-datto opened 1 year ago

lcurtis-datto commented 1 year ago

Looking at #560, it appeared that the memory leak was resolved. However, I find that after running a large number of scans, memory usage continually increases until the Python script is killed by the oom-killer. I have tried using multiprocessing and running the sslyze scan in a child process, but I keep running into deadlocks when scanning larger lists of hosts.

To Reproduce
The condition is reproducible by running the example code from #560:

import os
import gc
import psutil
from sslyze import Scanner, ServerScanRequest, ServerNetworkLocation

def print_memory_used(msg):
    object_count = len(gc.get_objects())
    process = psutil.Process(os.getpid())
    memory_used = process.memory_info().rss
    print(f"{msg}: object count: {object_count}, memory used: {memory_used / 1024**2} MB")

def sslyze_scan(hostname, port):
    # Queue a single scan and drain the results; nothing is kept after return,
    # so memory should be reclaimable once this function exits.
    request = ServerScanRequest(ServerNetworkLocation(hostname=hostname, port=port))
    scanner = Scanner()
    scanner.queue_scans([request])
    results = list(scanner.get_results())

for i in range(1,5):
    print_memory_used(f"before run {i}")
    sslyze_scan("mozilla.com", 443)
    print_memory_used(f"after run {i}")

Expected behavior
Memory usage should reset on each run, but the output shows:

before run 1: object count: 35980, memory used: 37.78515625 MB
after run 1: object count: 45321, memory used: 59.171875 MB
before run 2: object count: 45321, memory used: 59.171875 MB
after run 2: object count: 52064, memory used: 69.640625 MB
before run 3: object count: 52064, memory used: 69.640625 MB
after run 3: object count: 52101, memory used: 79.33984375 MB
before run 4: object count: 52101, memory used: 79.85546875 MB
after run 4: object count: 45080, memory used: 86.80078125 MB

If I do the same thing using multiprocessing, the memory is released.

before run 1: object count: 36584, memory used: 23.13671875 MB
after run 1: object count: 45749, memory used: 51.92578125 MB
before run 2: object count: 36597, memory used: 23.13671875 MB
after run 2: object count: 45759, memory used: 51.9296875 MB
before run 3: object count: 36609, memory used: 23.13671875 MB
after run 3: object count: 45768, memory used: 52.19140625 MB
before run 4: object count: 36621, memory used: 23.13671875 MB
after run 4: object count: 45777, memory used: 52.00390625 MB
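The pattern that produces the second set of numbers can be sketched generically: run each scan in a short-lived child process so that all of its memory, including anything leaked at the C level, is returned to the OS when the child exits. This is a minimal sketch, not sslyze's own API; `run_in_child` and `scan_worker` are hypothetical names, and the stand-in worker just allocates memory where a real `sslyze_scan()` call would go.

```python
import multiprocessing as mp

def run_in_child(fn, *args):
    """Run fn(*args) in a child process; its memory is freed when it exits."""
    # "spawn" starts a fresh interpreter rather than fork()ing the current
    # (possibly multi-threaded) one, which sidesteps fork-related deadlocks.
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=fn, args=args)
    p.start()
    p.join()
    return p.exitcode

def scan_worker(hostname, port):
    # Stand-in for the sslyze_scan() call above; any allocations made here
    # (shown with a dummy 10 MB buffer) die with the child process.
    _ = bytearray(10 * 1024 * 1024)

if __name__ == "__main__":
    exitcode = run_in_child(scan_worker, "mozilla.com", 443)
    print(f"child finished with exit code {exitcode}")
```

Because the parent only ever holds a `Process` handle, its RSS stays flat across runs, matching the "before run N" numbers above.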


Additional context
I have tried using multiprocessing in our main internal code, but I run into deadlocks when scanning larger groups of hosts; the stuck processes are blocked on a futex:

FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY

I am continuing to investigate other alternatives to avoid increasing memory usage on subsequent scans. Any assistance or advice is greatly appreciated.
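A futex wait like the one above is the classic symptom of fork()ing a multi-threaded process: the child inherits a lock that some other thread held at fork time, and that lock is never released in the child. One hedged way to avoid this is to use the "spawn" start method (fresh interpreter per worker, no inherited locks) together with `maxtasksperchild`, which recycles workers so any per-scan leak stays bounded. The sketch below uses a placeholder `scan_one` function rather than a real sslyze call:

```python
import multiprocessing as mp

def scan_one(target):
    hostname, port = target
    # Placeholder for a real sslyze scan of (hostname, port).
    return f"{hostname}:{port} done"

if __name__ == "__main__":
    # "spawn" avoids inheriting locks held by other threads at fork time
    # (a common cause of futex deadlocks); maxtasksperchild=1 gives every
    # scan a fresh worker, so leaked memory is released after each task.
    ctx = mp.get_context("spawn")
    targets = [("mozilla.com", 443), ("example.com", 443)]
    with ctx.Pool(processes=2, maxtasksperchild=1) as pool:
        for line in pool.imap_unordered(scan_one, targets):
            print(line)
```

`maxtasksperchild=1` trades process-startup overhead for strict memory isolation; a larger value amortizes the cost if the per-scan leak is small.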

yaroslav-dudar commented 4 months ago

Confirmed, the leak seems to be in the nassl library.

morrissimo commented 2 months ago

I believe I'm seeing this behavior as well.