ahmethakanbesel opened this issue 2 months ago
Would you try putting the pylon grab result into the queue instead of doing a deep copy? The line copy.deepcopy(grab.Array) is not part of pypylon, so we do not know whether the memory leak is coming from the copy function.
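In other words, one reading of that suggestion is a capture loop like the sketch below (not a prescribed implementation; it reuses the names, imports, 100 ms timeout and TIFF filenames from your own code):

    def capture_images(cam, image_queue, num_images):
        for i in range(num_images):
            with cam.RetrieveResult(100) as grab:
                if not grab.GrabSucceeded():
                    continue
                capture_time = time.time()
                # Sketch: queue the grab's array directly, without copy.deepcopy()
                image_array = grab.Array
                filename = f'img{i}.tiff'
                image_queue.put((image_array, filename, capture_time, i + 1))
                grab.Release()
        # Signal the save thread that we're done capturing
        image_queue.put(None)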
It doesn't change. The initial implementation was as you described. We tried creating a deep copy of the array in case something was preventing the release of the memory allocated for the grab.
OK, I will check and get back to you.
It seems that the memory leak is caused by OpenCV.
Please do 2 tests for me:
- Remove this line and run the code: cv2.imwrite(os.path.join(output_path, filename), image_array)
- In the line where you use the pylon array, simply use a dummy array and pass it to be saved: image_array = np.zeros((1080, 1920), dtype=np.uint8)
Let me know the results of these 2 tests, please.
We don't think it is caused by OpenCV, because the memory leak exists even if we use Pillow, as we wrote in our first message. But here are the test results.
When we removed cv2.imwrite, the output was:
Initial RAM usage: 311.91 MB
Final RAM usage: 311.87 MB
Total RAM increase: -0.04 MB
RAM usage after final GC: 284.97 MB
For the second suggestion, we tried two things. First, we did not put the actual image_array into the queue and instead created a dummy array before calling cv2.imwrite:
def capture_images(cam, image_queue, num_images):
    for i in range(num_images):
        with cam.RetrieveResult(100) as grab:
            if not grab.GrabSucceeded():
                continue
            capture_time = time.time()
            # image_array = copy.deepcopy(grab.Array)
            filename = f'img{i}.tiff'
            image_queue.put((None, filename, capture_time, i+1))
            grab.Release()
    # Signal the save thread that we're done capturing
    image_queue.put(None)

def save_images(image_queue, output_path):
    initial_ram = get_ram_usage()
    print(f"Initial RAM usage: {initial_ram:.2f} MB")
    while True:
        item = image_queue.get()
        if item is None:
            break
        image_array, filename, capture_time, image_number = item
        image_array = np.zeros((1080, 1920), dtype=np.uint8)
        cv2.imwrite(os.path.join(output_path, filename), image_array)
        # Explicitly delete large objects
        del image_array
And the output:
Initial RAM usage: 311.63 MB
Final RAM usage: 314.75 MB
Total RAM increase: 3.12 MB
RAM usage after final GC: 287.88 MB
Second, we put the image_array into the queue as usual and still created a dummy array before calling cv2.imwrite:
def capture_images(cam, image_queue, num_images):
    for i in range(num_images):
        with cam.RetrieveResult(100) as grab:
            if not grab.GrabSucceeded():
                continue
            capture_time = time.time()
            image_array = copy.deepcopy(grab.Array)
            filename = f'img{i}.tiff'
            image_queue.put((image_array, filename, capture_time, i+1))
            grab.Release()
    # Signal the save thread that we're done capturing
    image_queue.put(None)

def save_images(image_queue, output_path):
    initial_ram = get_ram_usage()
    print(f"Initial RAM usage: {initial_ram:.2f} MB")
    while True:
        item = image_queue.get()
        if item is None:
            break
        image_array, filename, capture_time, image_number = item
        image_array = np.zeros((1080, 1920), dtype=np.uint8)
        cv2.imwrite(os.path.join(output_path, filename), image_array)
        # Explicitly delete large objects
        del image_array
And the output didn't change:
Initial RAM usage: 311.70 MB
Final RAM usage: 2490.59 MB
Total RAM increase: 2178.89 MB
RAM usage after final GC: 2463.78 MB
When we removed cv2.imwrite, the output was:
Initial RAM usage: 311.91 MB, Final RAM usage: 311.87 MB, Total RAM increase: -0.04 MB, RAM usage after final GC: 284.97 MB
This seems to be OK, am I correct? Can you check whether the writing finishes its work, i.e. whether the queue still contains any objects at the end?
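For example, something like the following at the end of the run would confirm it (a sketch only, reusing the thread and queue names from the test script):

    capture_thread.join()
    save_thread.join()
    # Sketch: an empty queue here means the writer consumed every queued image
    print(f"Queue size: {image_queue.qsize()}")  # expected to be 0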
Yes, that is the expected memory usage. We have made some changes to the queue consumption logic. The system will now wait 5 seconds before stopping the consumer thread, in case an image has been grabbed but not yet added to the queue. But it didn't fix the issue.
while True:
    try:
        item = image_queue.get(block=True, timeout=5)
    except queue.Empty:
        break
    if not item:
        continue
    image_array, filename, capture_time, image_number = item
    cv2.imwrite(os.path.join(output_path, filename), image_array)
    # Explicitly delete large objects
    del image_array
And logged the queue size at the end of the program.
Initial RAM usage: 312.27 MB
Final RAM usage: 3048.97 MB
Total RAM increase: 2736.70 MB
RAM usage after final GC: 3022.13 MB
Queue size: 0
Would you test this code, please?
import argparse
from pypylon import pylon
import time
import os
import psutil
import queue
import threading
import gc
import cv2
import copy
import numpy as np

def parse_arguments():
    parser = argparse.ArgumentParser(description="Capture images from a camera and monitor RAM usage.")
    parser.add_argument('-w', '--width', type=int, default=1920, help='Image width (pixels)')
    parser.add_argument('-H', '--height', type=int, default=1200, help='Image height (pixels)')
    parser.add_argument('-n', '--num_images', type=int, default=2000, help='Number of images to take')
    parser.add_argument('-o', '--output_path', type=str, default='/home/support/out/', help='Output directory for images')
    return parser.parse_args()

def setup_camera(width, height):
    cam = pylon.InstantCamera(pylon.TlFactory.GetInstance().CreateFirstDevice())
    print("Using device ", cam.GetDeviceInfo().GetModelName())
    cam.Open()
    cam.Height.SetValue(height)
    cam.Width.SetValue(width)
    cam.AcquisitionFrameRateEnable.SetValue(True)
    cam.AcquisitionFrameRate.SetValue(5)
    return cam

def get_ram_usage():
    process = psutil.Process(os.getpid())
    return process.memory_info().rss / 1024 / 1024  # in MB

def capture_images(cam, image_queue, num_images):
    converter = pylon.ImageFormatConverter()
    # converting to opencv bgr format
    converter.OutputPixelFormat = pylon.PixelType_BGR8packed
    converter.OutputBitAlignment = pylon.OutputBitAlignment_MsbAligned
    for i in range(num_images):
        with cam.RetrieveResult(1000) as grabResult:
            if not grabResult.GrabSucceeded():
                continue
            capture_time = time.time()
            img = pylon.PylonImage()
            img.AttachGrabResultBuffer(grabResult)
            filename = f'img{i}.png'
            image_queue.put((img, filename, capture_time, i+1))
            grabResult.Release()
    # Signal the save thread that we're done capturing
    image_queue.put(None)

def save_images(image_queue, output_path):
    initial_ram = get_ram_usage()
    print(f"Initial RAM usage: {initial_ram:.2f} MB")
    while True:
        item = image_queue.get()
        if item is None:
            break
        img, filename, capture_time, image_number = item
        img.Save(pylon.ImageFileFormat_Png, os.path.join(output_path, filename))
        # cv2.imwrite(os.path.join(output_path, filename), img)
        img.Release()
        # img = None
        # del img
    gc.collect()
    final_ram = get_ram_usage()
    print(f"Final RAM usage: {final_ram:.2f} MB")
    print(f"Total RAM increase: {final_ram - initial_ram:.2f} MB")
    gc.collect()
    time.sleep(1)  # Give the system a moment to potentially release memory
    very_final_ram = get_ram_usage()
    print(f"RAM usage after final GC: {very_final_ram:.2f} MB")
    print(f"Final RAM change: {very_final_ram - initial_ram:.2f} MB")

def main():
    args = parse_arguments()
    os.makedirs(args.output_path, exist_ok=True)
    cam = setup_camera(args.width, args.height)
    cam.StartGrabbing(pylon.GrabStrategy_LatestImageOnly)
    image_queue = queue.Queue()
    try:
        save_thread = threading.Thread(target=save_images, args=(image_queue, args.output_path))
        capture_thread = threading.Thread(target=capture_images, args=(cam, image_queue, args.num_images))
        save_thread.start()
        capture_thread.start()
        capture_thread.join()
        save_thread.join()
    finally:
        cam.StopGrabbing()
        cam.Close()

main()
If you replace the cv2.imwrite line with PIL's Image.save(), you will not see the issue, so you can use a pylon image or a PIL image as a workaround. Maybe OpenCV uses cache memory to speed up the writing process.
Please test both of these alternative libraries and give feedback.
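For the Pillow variant, the save loop could look roughly like the sketch below (not a finished implementation; it assumes the queued item carries a numpy array, as in your code):

    import os
    from PIL import Image

    def save_images(image_queue, output_path):
        while True:
            item = image_queue.get()
            if item is None:
                break
            image_array, filename, capture_time, image_number = item
            # Sketch: save via Pillow instead of cv2.imwrite
            Image.fromarray(image_array).save(os.path.join(output_path, filename))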
Using Pillow doesn't solve the problem, as we mentioned earlier. We replaced cv2.imwrite with:
Image.fromarray(image_array).save(os.path.join(output_path, filename))
and the output was:
Initial RAM usage: 314.98 MB
Final RAM usage: 696.16 MB
Total RAM increase: 381.19 MB
RAM usage after final GC: 669.29 MB
Queue size: 0
So, the memory consumption is lower but it is still not released after grabbing is completed.
When we tried img.AttachGrabResultBuffer(grabResult) according to your example, we didn't observe a memory leak in terms of RSS and VSZ. However, we observed unusual memory consumption reporting when we ran the sudo systemctl status command to see the status of the service (in the production deployment, it will run as a systemd service).
● data-acq.service - Data acquisition service for pad system
Loaded: loaded (/etc/systemd/system/data-acq.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2024-09-13 14:36:01 +03; 2min 3s ago
Main PID: 179295 (python)
Tasks: 29 (limit: 154020)
Memory: 16.3G
CPU: 5min 51.595s
CGroup: /system.slice/data-acq.service
└─179295 python main.py
We prepared a simple script to monitor memory consumption of our service.
Grabbing started at 2024-09-13 14:36:12 and stopped at 2024-09-13 14:36:52. The output:
Monitoring RAM usage of data-acq.service every 5 seconds. Press Ctrl+C to stop.
Timestamp - RSS (ps command), VSZ (ps command), Systemd reported memory
2024-09-13 14:36:07 - RSS: 287.47 MB, VSZ: 2258.85 MB, Systemd: 178.1 MB
2024-09-13 14:36:12 - RSS: 314.97 MB, VSZ: 2717.91 MB, Systemd: 205.8 MB
2024-09-13 14:36:17 - RSS: 315.35 MB, VSZ: 2717.91 MB, Systemd: 2048.0 MB
2024-09-13 14:36:22 - RSS: 315.47 MB, VSZ: 2717.91 MB, Systemd: 4198.4 MB
2024-09-13 14:36:27 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 6144.0 MB
2024-09-13 14:36:32 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 8192.0 MB
2024-09-13 14:36:37 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 10240.0 MB
2024-09-13 14:36:42 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 12288.0 MB
2024-09-13 14:36:47 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 14540.8 MB
2024-09-13 14:36:52 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 16793.6 MB
2024-09-13 14:36:57 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 16793.6 MB
2024-09-13 14:37:02 - RSS: 315.60 MB, VSZ: 2717.91 MB, Systemd: 16793.6 MB
What could be the reason for the significant difference between RSS and systemd memory usage reporting?
Here's the script we used to monitor memory usage:
#!/bin/bash

SERVICE_NAME="data-acq.service"
INTERVAL=5  # Time in seconds between each check

# Function to convert KB to MB
kb_to_mb() {
    echo "scale=2; $1 / 1024" | bc
}

# Function to convert human-readable sizes to MB
to_mb() {
    local size=$1
    case ${size: -1} in
        K|k) echo "scale=2; ${size%?} / 1024" | bc ;;
        M|m) echo "${size%?}" ;;
        G|g) echo "scale=2; ${size%?} * 1024" | bc ;;
        *) echo "scale=2; $size / 1024 / 1024" | bc ;;  # Assume bytes if no unit
    esac
}

# Function to get memory usage
get_memory_usage() {
    PID=$(systemctl show -p MainPID -q $SERVICE_NAME | cut -d= -f2)
    if [ -z "$PID" ] || [ "$PID" -eq 0 ]; then
        echo "Service $SERVICE_NAME is not running."
        return 1
    fi

    # Get RSS and VSZ in kilobytes
    read RSS VSZ <<< $(ps -o rss=,vsz= -p $PID)

    # Convert to megabytes
    RSS_MB=$(kb_to_mb $RSS)
    VSZ_MB=$(kb_to_mb $VSZ)

    # Get systemd reported memory
    SYSTEMD_MEM=$(systemctl status $SERVICE_NAME | grep Memory | awk '{print $2}')
    SYSTEMD_MB=$(to_mb $SYSTEMD_MEM)

    echo "$(date '+%Y-%m-%d %H:%M:%S') - RSS: ${RSS_MB} MB, VSZ: ${VSZ_MB} MB, Systemd: ${SYSTEMD_MB} MB"
}

echo "Monitoring RAM usage of $SERVICE_NAME every $INTERVAL seconds. Press Ctrl+C to stop."
echo "Timestamp - RSS (ps command), VSZ (ps command), Systemd reported memory"

while true; do
    get_memory_usage
    sleep $INTERVAL
done
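To dig further into what that systemd number is counting, one option would be to read the service's cgroup accounting directly. The sketch below assumes cgroup v2 (the default on Ubuntu 22.04) and the data-acq.service cgroup path; on cgroup v2, systemd's reported memory comes from memory.current, and memory.stat breaks that total down into anonymous memory and file-backed page cache:

    # Sketch: compare the cgroup's total charge against its anon/file breakdown.
    # Paths assume cgroup v2 and the service name used in this thread.
    cgroup = "/sys/fs/cgroup/system.slice/data-acq.service"

    with open(f"{cgroup}/memory.current") as f:
        current = int(f.read())  # total charged memory (what systemd reports)

    stats = {}
    with open(f"{cgroup}/memory.stat") as f:
        for line in f:
            key, value = line.split()
            stats[key] = int(value)

    print(f"memory.current:        {current / 1024 / 1024:.1f} MB")
    print(f"anon (process memory): {stats['anon'] / 1024 / 1024:.1f} MB")
    print(f"file (page cache):     {stats['file'] / 1024 / 1024:.1f} MB")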
Hi,
When we grab images using the Python API and save the grabbed images to disk, memory is not released after completing all tasks. The memory consumption is proportional to the number of images we are grabbing. We have tried saving the images with Pillow, manually calling the garbage collector, and explicitly deleting the image_array, but none of these approaches had any effect on memory consumption. What could be the reason for this memory leak?
Example output:
Is your camera operational in Basler pylon viewer on your platform?
Yes
Hardware setup & camera model(s) used
CPU architecture: x86_64
Operating System: Ubuntu 22.04
RAM: 128GB
Runtime information: