devitocodes / devito

DSL and compiler framework for automated finite-differences and stencil computation
http://www.devitoproject.org
MIT License
562 stars 228 forks source link

Memory consumption gradually grows when Operator called in loop #845

Closed navjotk closed 1 year ago

navjotk commented 5 years ago

In current master, there are a few sources of increased memory consumption when operators are created in a loop through the seismic wrapper classes:

mloubout commented 5 years ago

Is the issue creating an operator or are you creating new solver?

FabioLuporini commented 5 years ago

the issue is creating new symbolic objects

mloubout commented 5 years ago

Didn't realize I haven't posted it there. There is a small issue wit hsympy.solve that I am trying to fix but the bottom line is:

if you call solve(eq), it seems like sympy is creating an extra reference to all the objects in eq so the cache won't clear. One of the (hacky) bypass for now is to directly write the stencil like at:

https://github.com/opesci/devito/blob/935621d5f020b08f3a229a53632b304a9761c8e0/examples/seismic/tti/operators.py#L17

FabioLuporini commented 5 years ago

not sure if related to solve. Maybe it's two different things.

I have a reproducer here

mloubout commented 5 years ago

Isn't this fixed byt the clear_cache fix? I k ow #945 is referenced here but I don't think it's the same issue

navjotk commented 5 years ago

Why is it not the same issue?

mloubout commented 5 years ago

Sorry read through it again and the title/description is quite missleading as it says that doing solver.op() multiple times increase mempry consumption, which is not the case. I would say rename/re-descrivbe it or open a different one.

navjotk commented 5 years ago

it says that doing solver.op() multiple times increase mempry consumption

It doesn't?

mloubout commented 4 years ago

With the new caching infrastructure it doesn't no, that's why I removed all the clear cache and preallocation of u0 in the fwi/rtm notebooks

FabioLuporini commented 4 years ago

I think it is still an issue. Every time new symbolic objects are created, a new class type is dynamically created ; that is what causes the memory increase (not the data), as apparently this dynamically created class cannot be eliminated from the python session.

solver.op() called in a loop thousands of times creates thousand of new (Time)Functions, each of them dynamically creating a new class type, hence increasing memory consumption. Note that the per-iteration increase is totally negligible ; but with O(10**4) iterations, this starts becoming a thing

It is also possible I'm missing something at this point... what I encourage to do is to take my MFE linked above and trying it again

ofmla commented 2 years ago

I would like to make a comment on this issue. Motived by discussion in the thread https://devitocodes.slack.com/archives/C1HSXP6KB/p1643619000223249 I performed several 3D experiments. In a first test (test1.py),


from examples.seismic.acoustic import AcousticWaveSolver
from examples.seismic import demo_model, setup_geometry

def main():
    shape = (151, 151, 151)    # 151x151x151 grid
    spacing = (10., 10., 10.)  # spacing of 10 meters
    origin = (0., 0., 0.)      # origin
    nbl = 0                    # number of pad points
    tn = 1000
    space_order = 8
    nshots = 20

    # Use a defined preset model
    model = demo_model('layers-isotropic', spacing=spacing, space_order=space_order,
                       shape=shape, nbl=nbl)

    geometry = setup_geometry(model, tn)

    solver = AcousticWaveSolver(model, geometry, kernel='OT2',
                                space_order=space_order)

    for i in range(nshots):
        solver.forward()

if __name__ == "__main__":
    main()

I modeled 20 shots in a loop using the forward method without any arguments and could see the memory growing, although at certain intervals of time the memory is released. test1

In a second experiment (test2.py)


from devito import *
from examples.seismic.acoustic import AcousticWaveSolver
from examples.seismic import demo_model, setup_geometry

def main():
    shape = (151, 151, 151)    # 151x151x151 grid
    spacing = (10., 10., 10.)  # spacing of 10 meters
    origin = (0., 0., 0.)      # origin
    nbl = 0                    # number of pad points
    tn = 1000
    space_order = 8
    nshots = 20

    # Use a defined preset model
    model = demo_model('layers-isotropic', spacing=spacing, space_order=space_order,
                       shape=shape, nbl=nbl)

    geometry = setup_geometry(model, tn)

    solver = AcousticWaveSolver(model, geometry, kernel='OT2',
                                space_order=space_order)

    # Define the wavefield with the size of the model and the time dimension
    v = TimeFunction(name="v", grid=model.grid, time_order=2,
                     space_order=space_order)

    for i in range(nshots):
        solver.forward(u=v)

if __name__ == "__main__":
    main()

I passed a TimeFunction created out of the loop as an argument to the forward method and I also observed an increasing in the memory on a smaller scale. test2 Then I verified that simply by using the op_fwd method (which returns the Operator) in the following way: solver.op_fwd().apply(dt=model.critical_dt) the memory is kept constant (test3.py).


from examples.seismic.acoustic import AcousticWaveSolver
from examples.seismic import demo_model, setup_geometry

def main():
    shape = (151, 151, 151)    # 151x151x151 grid
    spacing = (10., 10., 10.)  # spacing of 10 meters
    origin = (0., 0., 0.)      # origin
    nbl = 0                    # number of pad points
    tn = 1000
    space_order = 8
    nshots = 20

    # Use a defined preset model
    model = demo_model('layers-isotropic', spacing=spacing, space_order=space_order,
                       shape=shape, nbl=nbl)

    geometry = setup_geometry(model, tn)

    solver = AcousticWaveSolver(model, geometry, kernel='OT2',
                                space_order=space_order)

    for i in range(nshots):
        solver.op_fwd().apply(dt=model.critical_dt)

if __name__ == "__main__":
    main()

test3 However I was expecting this to be the case in the second experiment, as it is exactly the same way it is done in the forward method (https://github.com/devitocodes/devito/blob/master/examples/seismic/acoustic/wavesolver.py#L114).

You can run the tests with a simple command, i.e mprof run --include-children python test2.py for test 2. The plots are generated with mprof plot mprof.file where mprfo.file is the file created once the last command was executed.

Having thinked a little I came to realize that the creation of (Sparse)TimeFunctions by the forward method was the cause of the growth in memory. In addition to passing the field u to forward, you also should pass rec and src, i.e. running test 2 as below the memory profile is flat


 src = geometry.src
 rec = geometry.rec
 for i in range(nshots):
     solver.forward(src=src, rec=rec, u=v).

At certain point I considered the possibility of the solver (i.e. AcousticWaveSolver) classes were not working as expected, but there are no problems with them. The important take away message is that you must never create TimeFunction, Function and SparseTimeFunction within a loop. I would also say that the forward method has to be used with special care.

ggorman commented 1 year ago

Bug relates to known python bug: https://bugs.python.org/issue9417

ChatGPT summary: Based on the thread from the Python repository, the issue revolves around circular references created when declaring a class, either using the class statement or using type(). The circular reference problem can potentially cause complications with garbage collection, debugging, and performance, especially in certain specialized use cases like high-traffic servers or applications like Django that create temporary classes.

While the problem has been identified and discussed in detail, the consensus among the participants, including some Python core developers, is that fixing it would be complex, possibly leading to a change in public C APIs or introducing slower lookups. Additionally, the idea of adding a new 'unlink' method was not widely accepted.

The key takeaways from the discussion include:

  1. Acknowledgement of Circular References: The circular reference in class creation is acknowledged and understood by core developers. It is seen as a design tradeoff rather than a traditional "bug" in the sense that it is not a discrepancy between documented behavior and actual behavior.

  2. Proposed Solutions and Their Challenges: Various solutions were proposed, such as using weak references or adding a new "unlink" method, but they were met with concerns about performance impact, API changes, and whether they align with Python's design principles.

  3. Potential Workarounds: Some participants suggested the possibility of creating a new metaclass that produces reference-cycle-free classes or adopting practices to manually manage references. However, these were viewed as more experimental and specific to particular use cases rather than suitable for broad adoption.

  4. Issue Status as "Won't Fix": The status of the issue being closed with the resolution "won't fix" indicates that the core developers did not find a satisfactory way to address the problem that would align with Python's broader design goals and performance needs.

  5. Implementation Specificity: The problem seems to be more related to CPython (and possibly PyPy), and the solutions would likely expose implementation details. This also contributes to the resistance to implementing a fix within the standard library.

  6. Historical Context: The thread also makes references to other related issues and discussions, showing that this isn't an isolated incident but part of a more complex ongoing conversation about reference management within Python.

In conclusion, while the circular reference in class creation is recognized, the decision was made not to pursue a fix due to the complexity of the problem, the potential performance impact, and the risk of exposing implementation details. Users experiencing specific issues related to this behavior may need to consider custom solutions tailored to their particular needs and constraints.

ggorman commented 1 year ago

More discussion on Stackoverflow: https://stackoverflow.com/questions/22547233/are-dynamically-created-classes-always-unreachable-for-gc-in-python