Closed navjotk closed 1 year ago
Is the issue creating an operator, or are you creating a new solver?
the issue is creating new symbolic objects
Didn't realize I hadn't posted it there. There is a small issue with sympy.solve
that I am trying to fix, but the bottom line is:
if you call solve(eq), it seems like sympy is creating an extra reference to all the objects in eq, so the cache won't clear. One (hacky) bypass for now is to directly write the stencil like at:
Not sure if it's related to solve; maybe it's two different things.
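For illustration, the "write the stencil directly" bypass mentioned above amounts to rearranging the equation by hand instead of calling `solve`. A minimal SymPy sketch with a hypothetical equation (not the actual Devito stencil):

```python
import sympy as sp

x, y = sp.symbols('x y')
eq = sp.Eq(2*x + y - 4, 0)

# Via sympy.solve -- reported above to keep an extra reference to the
# objects in `eq`, preventing the cache from clearing
via_solve = sp.solve(eq, x)[0]

# Hacky bypass: write the rearranged expression directly, skipping solve()
by_hand = (4 - y) / 2

assert sp.simplify(via_solve - by_hand) == 0
```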
I have a reproducer here
Isn't this fixed by the clear_cache fix? I know #945 is referenced here, but I don't think it's the same issue.
Why is it not the same issue?
Sorry, read through it again and the title/description is quite misleading, as it says that doing solver.op()
multiple times increases memory consumption, which is not the case.
I would say rename/re-describe it or open a different one.
it says that doing solver.op() multiple times increases memory consumption
It doesn't?
With the new caching infrastructure it doesn't, no; that's why I removed all the clear_cache calls and the preallocation of u0
in the fwi/rtm notebooks.
I think it is still an issue. Every time new symbolic objects are created, a new class type is dynamically created; that is what causes the memory increase (not the data), as apparently this dynamically created class cannot be eliminated from the Python session.
solver.op() called in a loop thousands of times creates thousands of new (Time)Functions, each of them dynamically creating a new class type, hence increasing memory consumption. Note that the per-iteration increase is totally negligible; but with O(10**4) iterations, this starts becoming a thing.
It is also possible I'm missing something at this point... what I encourage you to do is to take my MFE linked above and try it again.
I would like to make a comment on this issue. Motivated by the discussion in the thread https://devitocodes.slack.com/archives/C1HSXP6KB/p1643619000223249, I performed several 3D experiments. In a first test (test1.py):
```python
from examples.seismic.acoustic import AcousticWaveSolver
from examples.seismic import demo_model, setup_geometry


def main():
    shape = (151, 151, 151)  # 151x151x151 grid
    spacing = (10., 10., 10.)  # spacing of 10 meters
    origin = (0., 0., 0.)  # origin
    nbl = 0  # number of pad points
    tn = 1000
    space_order = 8
    nshots = 20

    # Use a defined preset model
    model = demo_model('layers-isotropic', spacing=spacing, space_order=space_order,
                       shape=shape, nbl=nbl)
    geometry = setup_geometry(model, tn)
    solver = AcousticWaveSolver(model, geometry, kernel='OT2',
                                space_order=space_order)

    for i in range(nshots):
        solver.forward()


if __name__ == "__main__":
    main()
```
I modeled 20 shots in a loop using the forward
method without any arguments and could see the memory growing, although at certain intervals of time the memory is released.
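As a side note, this kind of growth can also be spotted without mprof by using the standard library's tracemalloc; a minimal sketch, where make_temp_class is just a stand-in for code that creates a new class per iteration (not Devito code):

```python
import tracemalloc


def make_temp_class(i):
    # Stand-in for code that dynamically creates a new class each iteration
    return type(f"Temp{i}", (object,), {"payload": [0] * 1000})


tracemalloc.start()
before = tracemalloc.take_snapshot()

kept = [make_temp_class(i) for i in range(500)]  # references kept alive

after = tracemalloc.take_snapshot()
growth = sum(stat.size_diff for stat in after.compare_to(before, 'lineno'))
print(f"net allocation growth: {growth} bytes")
```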
In a second experiment (test2.py):
```python
from devito import *
from examples.seismic.acoustic import AcousticWaveSolver
from examples.seismic import demo_model, setup_geometry


def main():
    shape = (151, 151, 151)  # 151x151x151 grid
    spacing = (10., 10., 10.)  # spacing of 10 meters
    origin = (0., 0., 0.)  # origin
    nbl = 0  # number of pad points
    tn = 1000
    space_order = 8
    nshots = 20

    # Use a defined preset model
    model = demo_model('layers-isotropic', spacing=spacing, space_order=space_order,
                       shape=shape, nbl=nbl)
    geometry = setup_geometry(model, tn)
    solver = AcousticWaveSolver(model, geometry, kernel='OT2',
                                space_order=space_order)

    # Define the wavefield with the size of the model and the time dimension
    v = TimeFunction(name="v", grid=model.grid, time_order=2,
                     space_order=space_order)

    for i in range(nshots):
        solver.forward(u=v)


if __name__ == "__main__":
    main()
```
I passed a TimeFunction
created outside the loop as an argument to the forward
method, and I also observed an increase in memory, though on a smaller scale.
Then I verified that simply by using the op_fwd
method (which returns the Operator
) in the following way: solver.op_fwd().apply(dt=model.critical_dt),
the memory is kept constant (test3.py):
```python
from examples.seismic.acoustic import AcousticWaveSolver
from examples.seismic import demo_model, setup_geometry


def main():
    shape = (151, 151, 151)  # 151x151x151 grid
    spacing = (10., 10., 10.)  # spacing of 10 meters
    origin = (0., 0., 0.)  # origin
    nbl = 0  # number of pad points
    tn = 1000
    space_order = 8
    nshots = 20

    # Use a defined preset model
    model = demo_model('layers-isotropic', spacing=spacing, space_order=space_order,
                       shape=shape, nbl=nbl)
    geometry = setup_geometry(model, tn)
    solver = AcousticWaveSolver(model, geometry, kernel='OT2',
                                space_order=space_order)

    for i in range(nshots):
        solver.op_fwd().apply(dt=model.critical_dt)


if __name__ == "__main__":
    main()
```
However, I was expecting this to also be the case in the second experiment, as it is exactly the same way it is done in the forward
method (https://github.com/devitocodes/devito/blob/master/examples/seismic/acoustic/wavesolver.py#L114).
You can run the tests with a simple command, e.g. mprof run --include-children python test2.py
for test 2. The plots are generated with mprof plot mprof.file,
where mprof.file
is the file created once the last command was executed.
After thinking about it a little, I came to realize that the creation of (Sparse)TimeFunctions
by the forward
method was the cause of the growth in memory. In addition to passing the field u
to forward
, you should also pass rec
and src
; i.e., running test 2 as below, the memory profile is flat:
```python
src = geometry.src
rec = geometry.rec

for i in range(nshots):
    solver.forward(src=src, rec=rec, u=v)
```
At a certain point I considered the possibility that the solver
classes (i.e. AcousticWaveSolver
) were not working as expected, but there are no problems with them. The important takeaway message is that you must never create TimeFunction
, Function
, or SparseTimeFunction
objects within a loop. I would also say that the forward
method has to be used with special care.
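The takeaway above is an instance of the general hoist-allocations-out-of-the-loop pattern; a generic Python sketch, with plain lists standing in for Devito Functions:

```python
# Anti-pattern: allocate a fresh buffer (in Devito terms, a fresh
# TimeFunction) on every iteration of the loop
results_bad = []
for i in range(3):
    buf = [0.0] * 1_000_000   # new allocation each iteration
    buf[0] = i
    results_bad.append(buf[0])

# Preferred: allocate once, outside the loop, and reuse the storage
buf = [0.0] * 1_000_000
results_good = []
for i in range(3):
    buf[0] = i                # reuse the same buffer
    results_good.append(buf[0])

assert results_bad == results_good
```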
The bug relates to a known Python bug: https://bugs.python.org/issue9417
ChatGPT summary:
Based on the thread from the Python repository, the issue revolves around circular references created when declaring a class, either using the class
statement or using type()
. The circular reference problem can potentially cause complications with garbage collection, debugging, and performance, especially in certain specialized use cases like high-traffic servers or applications like Django that create temporary classes.
While the problem has been identified and discussed in detail, the consensus among the participants, including some Python core developers, is that fixing it would be complex, possibly leading to a change in public C APIs or introducing slower lookups. Additionally, the idea of adding a new 'unlink' method was not widely accepted.
The key takeaways from the discussion include:
- Acknowledgement of Circular References: The circular reference in class creation is acknowledged and understood by core developers. It is seen as a design tradeoff rather than a traditional "bug" in the sense that it is not a discrepancy between documented behavior and actual behavior.
- Proposed Solutions and Their Challenges: Various solutions were proposed, such as using weak references or adding a new "unlink" method, but they were met with concerns about performance impact, API changes, and whether they align with Python's design principles.
- Potential Workarounds: Some participants suggested the possibility of creating a new metaclass that produces reference-cycle-free classes or adopting practices to manually manage references. However, these were viewed as more experimental and specific to particular use cases rather than suitable for broad adoption.
- Issue Status as "Won't Fix": The status of the issue being closed with the resolution "won't fix" indicates that the core developers did not find a satisfactory way to address the problem that would align with Python's broader design goals and performance needs.
- Implementation Specificity: The problem seems to be more related to CPython (and possibly PyPy), and the solutions would likely expose implementation details. This also contributes to the resistance to implementing a fix within the standard library.
- Historical Context: The thread also makes references to other related issues and discussions, showing that this isn't an isolated incident but part of a more complex ongoing conversation about reference management within Python.
In conclusion, while the circular reference in class creation is recognized, the decision was made not to pursue a fix due to the complexity of the problem, the potential performance impact, and the risk of exposing implementation details. Users experiencing specific issues related to this behavior may need to consider custom solutions tailored to their particular needs and constraints.
More discussion on Stackoverflow: https://stackoverflow.com/questions/22547233/are-dynamically-created-classes-always-unreachable-for-gc-in-python
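The circular reference discussed in those threads is easy to observe in CPython: a class created with type() is kept alive by its own reference cycles (e.g. via its __mro__ tuple), so plain reference counting cannot free it and it survives until the cycle collector runs. A minimal sketch:

```python
import gc
import weakref

# Dynamically create a class, the way Devito creates its symbolic types
C = type("Temp", (object,), {})
ref = weakref.ref(C)

del C  # drop the only external reference

# The class is still alive: it participates in reference cycles
# (C.__mro__ is a tuple that contains C itself), so reference
# counting alone cannot reclaim it
assert ref() is not None

gc.collect()  # the cycle collector can reclaim it...

# ...unless some other structure (such as a cache) still holds it
assert ref() is None
```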
In current master, there are a few sources of increased memory consumption when operators are created in a loop through the seismic wrapper classes: