threeML / astromodels

Spatial and spectral models for astrophysics
BSD 3-Clause "New" or "Revised" License

Memory leaks when building models from custom templates #212

Open salaza82 opened 1 month ago

salaza82 commented 1 month ago

Spectral model classes that use the FunctionMeta metaclass and load data leak memory. Memory leaks as soon as the constructor is called, and again when the class object goes out of scope.

The class definition does the following:

from astromodels.functions.function import Function1D, FunctionMeta
import numpy as np

class CustomSpec(Function1D, metaclass=FunctionMeta):

    def _setup(self):
        pass  # Does nothing

    def _load_spec_from_params(self):
        self._data = np.load(<path_to_data_file>, allow_pickle=True).item()

    def set_params(self, params):
        self._attributes = params  # store the parameters on the instance

        self._load_spec_from_params()

The following will leak slowly

import psutil as ps
from threeML import *
# Custom spectral class, assumed to be defined in a module named CustomSpec
from CustomSpec import CustomSpec

p = ps.Process()

for i in range(500):
    spectrum = CustomSpec() # instantiate spectral model

    if i%50 == 0: # Print memory use every 50 iters
        print(f'Real Usage: {p.memory_info().rss * 1e-6:4.1f} MB')

And the following leaks substantially.

spectrum = CustomSpec()

for i in range(500):

    # Define Spectral model Parameters
    spectrum.set_params("your params here")

    # Instantiate spatial template
    myDwarf = PointSource('name', 0.0, 30.0, spectral_shape=spectrum)
    model = Model(myDwarf)

    if i%50 == 0 or i == 499: # Print memory use every 50 steps
        print(f'Real Usage: {p.memory_info().rss * 1e-6:4.1f} MB')

In the second block, the jump in memory use is exactly the size of the data table loaded in the custom spectral class.

omodei commented 1 month ago

Hi Dan, I modified your code to make it work. The function implementation needs a docstring, the units setter, and the evaluate function. This is the implementation (I simply read in a big array to test for memory leaks):

import numpy as np
import psutil as ps
from threeML import *

a = np.ones((1000,1000))
np.savez('123.npz', a=a)

class CustomSpec(Function1D, metaclass=FunctionMeta):
    r"""
    description :
        CustomSpec
    latex :  $\delta$
    parameters :

        K :
            desc : Normalization
            initial value : 1.0
            is_normalization : True
            transformation : log10
            min : 1e-10
            max: 1e4
            delta: 0.1
    """
    def _setup(self):
        pass
        # Does nothing

    def _load_spec_from_params(self):
        self._data = np.load('123.npz', allow_pickle=True)

    def set_params(self, params):
        self._load_spec_from_params()

    def evaluate(self,x, K):
        # np.load on a .npz file returns an NpzFile, indexed by array name ('a'), not by position
        return K * x * self._data['a'][0]

    def _set_units(self, x_unit, y_unit):
        # The normalization has the same units as the y
        self.K.unit = y_unit
        pass

Then, I ran your simple tests:

p = ps.Process()
N=1000
for i in range(N):
    spectrum = CustomSpec() # instantiate spectral model
    if i%50 == 0: # Print memory use every 50 iters
        print(f'Real Usage: {p.memory_info().rss * 1e-6:4.1f} MB')

and the real usage went from 307.4 MB to 307.8 MB. The second test:

spectrum = CustomSpec()

for i in range(N):

    # Define Spectral model Parameters
    spectrum.set_params("your params here")

    # Instantiate spatial template
    myDwarf = PointSource('myPointSource', 0.0, 30.0, spectral_shape=spectrum)
    model = Model(myDwarf)
    #del model # <--- This didn't make any difference!
    if i%50 == 0 or i == N-1: # Print memory use every 50 steps
        print(f'{i:d} Real Usage: {p.memory_info().rss * 1e-6:4.1f} MB')

The memory went from 307.9 MB to 308.0 MB, so, not a big difference. Can you double check?

salaza82 commented 1 month ago

Hi Nicola,

Sorry, I pasted the working code from my analysis pipeline instead of the failure mode. The second block should look like this:

for i in range(N):

    spectrum = CustomSpec() # instantiate this in the for loop instead

    # Define Spectral model Parameters
    spectrum.set_params("your params here")

    # Instantiate spatial template
    myDwarf = PointSource('myPointSource', 0.0, 30.0, spectral_shape=spectrum)
    model = Model(myDwarf)
    #del model # <--- This didn't make any difference!
    if i%50 == 0 or i == N-1: # Print memory use every 50 steps
        print(f'{i:d} Real Usage: {p.memory_info().rss * 1e-6:4.1f} MB')

Best, Dan S.-G.



omodei commented 1 month ago

You can reduce the memory leaks by explicitly deleting the objects in the reverse order of their creation. Inside the loop, call del model, del myDwarf, and del spectrum:

for i in range(N):

    spectrum = CustomSpec() # instantiate this in the for loop instead

    # Define Spectral model Parameters
    spectrum.set_params("your params here")

    # Instantiate spatial template
    myDwarf = PointSource('myPointSource', 0.0, 30.0, spectral_shape=spectrum)
    model = Model(myDwarf)
    del model
    del myDwarf
    del spectrum
    if i%50 == 0 or i == N-1: # Print memory use every 50 steps
        print(f'{i:d} Real Usage: {p.memory_info().rss * 1e-6:4.1f} MB')
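
The thread does not establish why explicit deletion helps. One way to test whether an object is actually freed once its name is deleted is to watch it through a weak reference, with and without a run of the cycle collector. The sketch below is a minimal illustration under that assumption: `Node`, `make_cycle`, and `is_freed_after_del` are hypothetical names, not astromodels classes or functions.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a model component; not an astromodels class."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self          # child -> parent
        self.children.append(child)  # parent -> child: together a reference cycle


def make_cycle():
    """Build a small parent/child tree whose members reference each other."""
    root = Node("model")
    root.add(Node("source"))
    return root


def is_freed_after_del(make_obj, collect=False):
    """Create an object, delete the only strong name, and report whether it is gone."""
    obj = make_obj()
    probe = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj
    if collect:
        gc.collect()          # run the cycle collector explicitly
    return probe() is None


print("plain object freed by del:", is_freed_after_del(lambda: Node("alone")))
print("cycle freed by del alone:", is_freed_after_del(make_cycle))
print("cycle freed after gc.collect():", is_freed_after_del(make_cycle, collect=True))
```

Applying the same probe to `spectrum`, `myDwarf`, and `model` would show which of them survives `del`, and whether `gc.collect()` is enough to release it.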

salaza82 commented 1 month ago

Nice! I'll work that into my analysis chain. Can this also be added to the API documentation so that future users don't run into this issue? Or maybe it could be built into the destructors of the Model, PointSource, and spectral superclasses.

Thanks, Dan S.-G.
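
The idea of making the cleanup automatic could be sketched on the user side as a context manager that drops the large table on exit, so the memory is bounded even if the object itself is kept alive elsewhere. This is only an illustration of the pattern: `TemplateSpectrum`, `load`, `release`, and the `loader` argument are hypothetical and do not exist in astromodels.

```python
class TemplateSpectrum:
    """Hypothetical holder for a large data table; not part of astromodels."""

    def __init__(self, loader):
        self._loader = loader  # callable that returns the data table
        self._data = None

    @property
    def data(self):
        return self._data

    def load(self):
        """Read the table into memory."""
        self._data = self._loader()

    def release(self):
        """Drop the reference to the table so its memory can be reclaimed."""
        self._data = None

    def __enter__(self):
        self.load()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
        return False  # do not swallow exceptions raised inside the block


# Usage: the table only lives for the duration of the with-block.
with TemplateSpectrum(lambda: bytearray(1_000_000)) as template:
    size_inside = len(template.data)

print(size_inside, template.data)
```

Releasing the table explicitly does not depend on when, or whether, the destructor of the owning object runs, which is why it may be a more predictable hook than `__del__`.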

