espressomd / espresso

The ESPResSo package
https://espressomd.org
GNU General Public License v3.0

Online visualization memory leak #3530

Status: closed by jngrad 4 years ago

jngrad commented 4 years ago

There is a memory leak when running a simulation with the OpenGL visualizer. The leak rate is around 12 MiB/min for a polymer (visualization_bonded.py) and 0.3 MiB/min for a Lennard-Jones liquid (visualization_ljliquid.py). It does not change when doubling or halving the number of particles in the system. The leak persists even if the simulation freezes, e.g. when the for loop in the integration thread stops. I'm not sure whether the bug comes from the ESPResSo code or from the OpenGL driver. The issue was reported on the mailing list. Confirmed on the current python branch (cd66da057d) and on the 4.1.2 release.

Here is a shell script to measure memory usage of a simulation as a function of time. The monitoring loop stops when the visualizer window is closed by the user.

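./pypresso ../samples/visualization_bonded.py --opengl &
pid=$!
# sample every 10 s for as long as the simulation process is alive
while kill -0 "${pid}" 2>/dev/null; do
  sleep 10
  # columns: elapsed time (s), physical memory (%), virtual memory (kiB)
  ps -p "${pid}" -o etimes,pmem,vsz --no-headers >> ram.tsv
done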

Data processing with matplotlib:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import savgol_filter
from psutil import virtual_memory

# three columns: time in seconds, physical memory usage in percentage with 0.1%
# resolution (requires a linear Savitzky-Golay filter), virtual memory usage in kiB
data = np.loadtxt('ram.tsv')
time = data[:,0] / 60.  # runtime in minutes
pmem = data[:,1] * virtual_memory().total / 100. / 1024**2  # physical memory in MiB
vmem = data[:,2] / 1024.  # virtual memory in MiB

print('memory leak rate: {:.1f} MiB / min'.format(
    (pmem[-1] - pmem[0]) / (time[-1] - time[0])))

plt.plot(time, savgol_filter(pmem - pmem[0], 61, 1), label='Physical memory')
plt.plot(time, vmem - vmem[0], label='Virtual memory')
plt.xlabel('Time (min)')
plt.ylabel('Memory usage (MiB)')
plt.title('Memory leak after OpenGL initialization')
plt.legend()
plt.show()
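The script above estimates the leak rate from the first and last samples only. As a possible alternative, here is a minimal sketch that fits a straight line through all samples, which may be less sensitive to the 0.1% resolution of the pmem column. The function name `leak_rate` and the synthetic data (a 12 MiB/min trend quantized to 16 MiB steps) are assumptions for illustration, not measurements from this issue.

```python
def leak_rate(minutes, mib):
    """Least-squares slope of memory usage (MiB) versus time (min)."""
    n = len(minutes)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(minutes) / n
    mean_m = sum(mib) / n
    # slope = covariance(time, memory) / variance(time)
    cov = sum((t - mean_t) * (m - mean_m) for t, m in zip(minutes, mib))
    var = sum((t - mean_t) ** 2 for t in minutes)
    return cov / var


# Synthetic data (hypothetical): one sample every 10 s for one hour,
# with a 12 MiB/min linear trend.
minutes = [i / 6 for i in range(361)]
exact = [12.0 * t for t in minutes]
# Mimic a coarse memory column by rounding to multiples of 16 MiB.
coarse = [16.0 * round(m / 16.0) for m in exact]

print('fit on exact data:  {:.2f} MiB / min'.format(leak_rate(minutes, exact)))
print('fit on coarse data: {:.2f} MiB / min'.format(leak_rate(minutes, coarse)))
```

With real data, the `time` and `pmem` arrays loaded from ram.tsv could be passed in directly. A linear fit only makes sense for the polymer-like case; for step-like curves the slope is just an average.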

The polymer's memory usage shows a linear trend, while the LJ liquid's grows in steps (plots: ram-bonds-visu, ram-lj-visu).

jngrad commented 4 years ago

The behavior persists when using visualizer.run(0), so the bug must be contained in the Cython file visualization_opengl.pyx. When bonds are not drawn (option draw_bonds=False), the memory leak rate of a polymer simulation is identical to that of a LJ liquid: 0.3 MiB/min, with the same memory usage curve.

There seems to be a missing memory deallocation in function draw_cylinder(). Adding OpenGL.GLU.gluDeleteQuadric(quadric) doesn't change the graphical output but reduces the memory leak rate down to 0.3 MiB/min. The memory deallocation is also missing in a few shapes (SpheroCylinder, SimplePore, SlitPore), causing a 3 MiB/min leak for an array of 100 spherocylinders. This was easy to fix. After that, the memory usage of the spherocylinder array grows by 1.5 MiB/min for the first 4 minutes, then stays constant for 10 min, rapidly decreases by half, and remains constant. This doesn't look like a memory leak anymore.

I couldn't find any other GLU memory allocation that lacks a matching deallocation, and all memory allocations of a matrix on the OpenGL transformation matrix stack are followed by a memory deallocation. I don't know what causes the 0.3 MiB/min leak rate for spheres, but it's tolerable: it amounts to roughly 100 MiB for every 6 hours of visualization. There is no memory leak when the system is empty.