PyAV-Org / PyAV

Pythonic bindings for FFmpeg's libraries.
https://pyav.basswood-io.com/
BSD 3-Clause "New" or "Revised" License

Memory leak if mp4 container not explicitly closed #1117

Open hmaarrfk opened 1 year ago

hmaarrfk commented 1 year ago

Overview

I think there is a memory leak that occurs if you don't explicitly close a container or stream.

I'm still trying to narrow the problem down, but I have a minimal reproducing example that I think is worth sharing at this stage.

It seems that __dealloc__ isn't being called as expected, maybe? https://github.com/PyAV-Org/PyAV/blob/main/av/container/input.pyx#L88
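One general way a finalizer such as __dealloc__ can run later than expected is a reference cycle, which is only freed when the cyclic garbage collector runs. Whether that is what happens inside PyAV is not established in this thread; the sketch below uses a hypothetical pure-Python `Node` class only to show the mechanism, and suggests that calling `gc.collect()` inside the read loop might be a cheap way to test the idea.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for two objects that reference each other."""


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a   # a <-> b form a reference cycle
    return weakref.ref(a)     # lets us observe whether `a` is still alive


gc.disable()                  # make the timing deterministic for this demo
ref = make_cycle()
alive_before = ref() is not None   # the cycle keeps both objects alive
gc.collect()                       # the cycle collector frees them
alive_after = ref() is not None
gc.enable()

print(alive_before, alive_after)
```

If memory in the reproducer stays flat once `gc.collect()` is called on each iteration, a cycle is a plausible cause; if it keeps growing, the leak is more likely on the C side.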

import av
import numpy as np
from tqdm import tqdm
# from tqdm.notebook import tqdm

# Write a small test video that the read loop below opens repeatedly.
data = np.ones((256, 256, 3), dtype='uint8')

filename = "video.mp4"
container = av.open(filename, mode="w")
stream = container.add_stream('libopenh264', rate=30)
stream.height = data.shape[0]
stream.width = data.shape[1]
stream.pix_fmt = "yuv420p"

for j in tqdm(range(100), leave=False):
    frame = av.VideoFrame.from_ndarray(data)
    for packet in stream.encode(frame):
        container.mux(packet)

# Flush any frames still buffered in the encoder.
for packet in stream.encode():
    container.mux(packet)

stream.close()
container.close()

import psutil
import os
virtual_memory = []
memory_used = []

for j in tqdm(range(10000)):
    # Re-open the single file written above on every iteration.
    filename = "video.mp4"
    container = av.open(filename, mode="r")
    stream = container.streams.video[0]
    packet_generator = container.demux(stream)
    for packet in packet_generator:
        stream.decode(packet)
    process = psutil.Process(os.getpid())
    memory_info = process.memory_info()
    memory_used.append(memory_info.rss)
    virtual_memory.append(memory_info.vms)
    # Explicitly call close to reduce the memory leak.
    # stream.close()
    # container.close()

import numpy as np
from matplotlib import pyplot as plt
m_used = np.asarray(memory_used)
v_used = np.asarray(virtual_memory)
plt.loglog(np.arange(1, len(m_used)), m_used[1:] - m_used[0], label="Memory Used")
plt.ylabel("Memory Usage (Bytes)")
plt.xlabel("Iteration (count)")
# plt.loglog(np.arange(1, len(m_used) + 1), v_used, label="Virtual Used")
plt.legend()

[image: plot of memory used versus iteration]

Expected behavior

I expect the memory to be released. If I add the close calls to the loop, I get:

[image: plot of memory used versus iteration, with the close calls]

Versions

My PyAV builds include the patch in https://github.com/PyAV-Org/PyAV/pull/1061 in addition to the changes in master. I really don't think it is related, but if you want, I'll run my plotting code again without it.

Research

I have done the following:

Additional context

Maybe related to:

github-actions[bot] commented 11 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hmaarrfk commented 11 months ago

Not stale

bfreskura commented 11 months ago

hi @hmaarrfk

Have you found any other solution besides explicitly closing containers and streams?

This is my current code that encodes and decodes from the list of numpy array images (imgs_padded). It still produces memory leaks:

with io.BytesIO() as buf:
    with av.open(buf, "w", format=container_string) as container:
        stream = container.add_stream(codec, rate=rate, options=options)
        stream.height = imgs[0].shape[0]
        stream.width = imgs[0].shape[1]
        stream.pix_fmt = pixel_fmt

        for img in imgs_padded:
            frame = av.VideoFrame.from_ndarray(img, format="rgb24")
            frame.pict_type = "NONE"
            for packet in stream.encode(frame):
                container.mux(packet)

        # Flush stream
        for packet in stream.encode():
            container.mux(packet)

        stream.close()

    outputs = []
    with av.open(buf, "r", format=container_string) as video:
        for i, frame in enumerate(video.decode(video=0)):
            if sequence_length <= i < sequence_length * 2:
                outputs.append(frame.to_rgb().to_ndarray().astype(np.uint8))

            if i >= sequence_length * 2:
                break

        video.streams.video[0].close()

hmaarrfk commented 11 months ago

no, i just explicitly call close.
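For anyone who wants the explicit close to happen even when decoding raises, one option is to wrap the container in `contextlib.closing`. This is only a sketch: `decode_all`, `decode_file` and the `opener` parameter are hypothetical names introduced here, and `opener` is meant to be `av.open` in real use. It assumes `stream.decode(packet)` returns a list of frames, as in the reproducer above.

```python
import contextlib


def decode_all(container):
    """Decode every packet of the first video stream and return the frame count."""
    stream = container.streams.video[0]
    frame_count = 0
    for packet in container.demux(stream):
        frame_count += len(stream.decode(packet))
    return frame_count


def decode_file(path, opener):
    """Open `path` with `opener` and always close the container afterwards."""
    # contextlib.closing calls container.close() on exit, including on errors.
    with contextlib.closing(opener(path, mode="r")) as container:
        return decode_all(container)
```

In the read loop this would be called as `decode_file("video.mp4", av.open)`. It does not fix the underlying leak; it only makes the workaround harder to forget.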

bfreskura commented 11 months ago

Ok, thanks.

I later found out the memory leak only occurs if I use the libx265 codec. Everything is ok when using libx264, mpeg2video, mpeg1video, libvpx-vp9.

meakbiyik commented 10 months ago

I can also reproduce this, without stream.close() the memory leaks.

It seems to happen when I process the frames in a different process, and stream.close() causes everything to hang if I use more than one process. Multiple bugs bundled into one 😅

RoyaltyLJW commented 9 months ago

Hi @meakbiyik, have you found any solution to this problem? I find that if I break before all the frames have been extracted, stream.close() causes everything to hang.

meakbiyik commented 9 months ago

Hey @RoyaltyLJW, I still use stream.close(), and I was able to fix the deadlock issue by setting the environment variable PYAV_LOGGING=off. Here's that bug: https://github.com/PyAV-Org/PyAV/issues/751
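The environment variable can also be set from Python rather than from the shell. A minimal sketch, assuming the variable is read when av is first imported and therefore has to be set before that import (the timing is an assumption here, not something confirmed in this thread):

```python
import os

# Set the variable before PyAV is imported (assumed to be when it is read).
os.environ["PYAV_LOGGING"] = "off"

try:
    import av  # imported only after the variable is set
except ImportError:
    av = None  # PyAV is not installed in this environment
```

Child processes started afterwards inherit `os.environ`, so worker processes should see the same setting.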

RoyaltyLJW commented 9 months ago

@meakbiyik Thanks a lot. It fixed my deadlock issue.

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hmaarrfk commented 5 months ago

Not stale