firedrakeproject / firedrake

Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM)
https://firedrakeproject.org

When calculating with Firedrake, the calculation is aborted. How do I resolve it? #3624

Closed qk11853 closed 2 weeks ago

qk11853 commented 2 weeks ago

When calculating with Firedrake, the calculation is aborted. How do I resolve it?

30%|██▉ | 9793/32768 [57:51<2:15:43, 2.82it/s]
Traceback (most recent call last):
  File "/home/Math/firedrake/src/firedrake/firedrake/interpolation.py", line 116, in interpolate
    assembled_interpolator = self.frozen_assembled_interpolator
AttributeError: 'Interpolator' object has no attribute 'frozen_assembled_interpolator'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/Math/firedrake/src/PyOP2/pyop2/global_kernel.py", line 314, in __call__
    func = self._func_cache[key]
KeyError: 136892867291248

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/Math/firedrake/src/PyOP2/pyop2/compilation.py", line 335, in get_so
    return ctypes.CDLL(soname)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/zhan/Desktop/Zhan/6d/0de3c0e4ecf8614b3892628044d2f2.so: cannot apply additional memory protection after relocation: Cannot allocate memory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zhan/space.py", line 94, in <module>
    solver.solve()
  File "petsc4py/PETSc/Log.pyx", line 115, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "petsc4py/PETSc/Log.pyx", line 116, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "/home/Math/firedrake/src/firedrake/firedrake/adjoint/variational_solver.py", line 90, in wrapper
    out = solve(self, **kwargs)
  File "/home/Math/firedrake/src/firedrake/firedrake/variational_solver.py", line 264, in solve
    dbc.apply(self._problem.u)
  File "petsc4py/PETSc/Log.pyx", line 115, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "petsc4py/PETSc/Log.pyx", line 116, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "/home/Math/firedrake/src/firedrake/firedrake/adjoint/dirichletbc.py", line 32, in wrapper
    ret = apply(self, *args, **kwargs)
  File "/home/Math/firedrake/src/firedrake/firedrake/bcs.py", line 431, in apply
    r.assign(self.function_arg, subset=self.node_set)
  File "/home/Math/firedrake/src/firedrake/firedrake/bcs.py", line 297, in function_arg
    self._function_arg_update()
  File "petsc4py/PETSc/Log.pyx", line 115, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "petsc4py/PETSc/Log.pyx", line 116, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "/home/Math/firedrake/src/firedrake/firedrake/adjoint/interpolate.py", line 22, in wrapper
    output = interpolate(interpolator, *function, **kwargs)
  File "/home/Math/firedrake/src/firedrake/firedrake/interpolation.py", line 119, in interpolate
    assembled_interpolator = self.callable()
  File "/home/Math/firedrake/src/firedrake/firedrake/interpolation.py", line 231, in callable
    l()
  File "/home/Math/firedrake/src/PyOP2/pyop2/parloop.py", line 206, in compute
    self()
  File "petsc4py/PETSc/Log.pyx", line 115, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "petsc4py/PETSc/Log.pyx", line 116, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "/home/Math/firedrake/src/PyOP2/pyop2/parloop.py", line 215, in __call__
    self._compute(self.iterset.core_part)
  File "/home/Math/firedrake/src/PyOP2/pyop2/parloop.py", line 197, in _compute
    self.global_kernel(self.comm, part.offset, part.offset+part.size, *self.arglist)
  File "/home/Math/firedrake/src/PyOP2/pyop2/global_kernel.py", line 316, in __call__
    func = self.compile(comm)
  File "petsc4py/PETSc/Log.pyx", line 115, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "petsc4py/PETSc/Log.pyx", line 116, in petsc4py.PETSc.Log.EventDecorator.decorator.wrapped_func
  File "/home/Math/firedrake/src/PyOP2/pyop2/global_kernel.py", line 387, in compile
    return compilation.load(self, extension, self.name,
  File "/home/Math/firedrake/src/PyOP2/pyop2/compilation.py", line 606, in load
    dll = compiler(cppargs, ldargs, cpp=cpp, comm=comm).get_so(code, extension)
  File "/home/Math/firedrake/src/PyOP2/pyop2/compilation.py", line 415, in get_so
    return ctypes.CDLL(soname)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/zhan/Desktop/Zhan/6d/0de3c0e4ecf8614b3892628044d2f2.so: cannot apply additional memory protection after relocation: Cannot allocate memory

connorjward commented 2 weeks ago

OSError: /home/zhan/Desktop/Zhan/6d/0de3c0e4ecf8614b3892628044d2f2.so: cannot apply additional memory protection after relocation: Cannot allocate memory

Your computer appears to be running out of memory. You can verify this with a system resource viewer such as htop.
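
As a rough complement to watching a resource viewer, the process can report its own peak memory from inside the loop. This is only a minimal sketch using Python's standard resource module (Unix only); the helper name peak_rss_mb and the loop around it are illustrative assumptions, not part of Firedrake.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


# Hypothetical usage: print the peak every so often inside the time loop.
for step in range(3):
    # ... one solver step would go here ...
    print(f"step {step}: peak RSS {peak_rss_mb():.1f} MiB")
```

If the reported peak keeps climbing steadily over thousands of iterations, that points towards something accumulating in the loop rather than the problem simply being too large.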

To resolve this you could either run on a machine with more memory or reduce the problem size. It's also possible that you have introduced some sort of memory leak that prevents Python from garbage-collecting old objects. Can you provide some more information about the simulation you are running?

wence- commented 2 weeks ago

That error usually occurs if you have a timestepping (or similar) loop, and are doing some assignment or computation that has a dependence on a literal numeric value.

For example, a typical thing you might think to write is:

t = 0
while t < t_end:
    t += dt
    some_form_with_t_in_it = ... t ...
    assemble(some_form_with_t_in_it) # or solve, or interpolate, ...

This will provoke compilation of a new kernel on every iteration and eventually produce this error.
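
One way to check whether this is happening is to count the compiled shared objects (.so files, like the one named in the traceback) in the kernel cache directory at intervals while the loop runs. The sketch below is only an illustration: count_compiled_kernels is a hypothetical helper, and the cache directory depends on your setup, so it is passed in as a parameter. The demonstration uses a temporary directory with made-up file names.

```python
import tempfile
from pathlib import Path


def count_compiled_kernels(cache_dir):
    """Count shared objects (*.so) anywhere below cache_dir."""
    return sum(1 for _ in Path(cache_dir).rglob("*.so"))


# Self-contained demonstration on a temporary directory (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "6d").mkdir()
    (Path(tmp) / "6d" / "kernel_a.so").touch()
    (Path(tmp) / "6d" / "kernel_a.c").touch()  # not a shared object, ignored
    (Path(tmp) / "kernel_b.so").touch()
    demo_count = count_compiled_kernels(tmp)

print(demo_count)
```

If the count for your real cache directory grows by roughly one per iteration, a new kernel is being compiled on every pass through the loop.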

The correct approach is to wrap any literal value that is updated in the loop in a Constant, build the form once outside the loop, and update the value with assign:

t = Constant(...)
some_form_with_t_in_it = ... t ...
while float(t) < t_end:
    t.assign(t + dt)
    assemble(some_form_with_t_in_it)

qk11853 commented 2 weeks ago

Could you please give me more details of the code? I don't quite understand it. Thank you very much.

connorjward commented 2 weeks ago

The core difference is what gets passed to assemble(...) (the same logic applies to solve(...) and other Firedrake functions).

If you assemble a form containing a literal number, such as 1.68 * u * v * dx, that number is embedded in the generated code. Changing it to, say, 1.72 * u * v * dx therefore causes brand new code to be generated. If this happens many times, for example in a timestepping loop, you can run into the sort of problem you describe.

The fix is to wrap the literal value in a Firedrake Constant, e.g. Constant(1.68) * u * v * dx. If you later assemble Constant(1.72) * u * v * dx, no new code is generated.

This is described in the Firedrake manual.
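
The difference between the two cases can be illustrated with a toy model in plain Python. This is not Firedrake's or PyOP2's actual caching code, only a sketch of the idea: kernels are cached by their generated source text, so a value written into the source produces a new kernel each time it changes, while a value passed as a runtime argument does not. All names here (get_kernel, assemble_with_literal, assemble_with_constant) are made up for the illustration.

```python
compiled_kernels = {}  # generated source text -> compiled function


def get_kernel(source):
    """Compile `source` once and reuse it for identical source text."""
    if source not in compiled_kernels:
        namespace = {}
        exec(source, namespace)  # stands in for an expensive C compile
        compiled_kernels[source] = namespace["kernel"]
    return compiled_kernels[source]


def assemble_with_literal(coefficient, x):
    # The number is written into the generated code, so every new
    # value yields new source text and therefore a new kernel.
    source = f"def kernel(x):\n    return {coefficient!r} * x\n"
    return get_kernel(source)(x)


def assemble_with_constant(coefficient, x):
    # The number is a runtime argument; the source text never changes.
    source = "def kernel(c, x):\n    return c * x\n"
    return get_kernel(source)(coefficient, x)


steps = [0.1 * n for n in range(1, 101)]  # 100 distinct "time" values

for t in steps:
    assemble_with_literal(t, 2.0)
kernels_with_literal = len(compiled_kernels)

compiled_kernels.clear()
for t in steps:
    assemble_with_constant(t, 2.0)
kernels_with_constant = len(compiled_kernels)

print(kernels_with_literal, kernels_with_constant)
```

In this toy model the literal version ends up with one cached kernel per time value, while the constant version reuses a single kernel throughout, which mirrors why wrapping the value in a Constant avoids the repeated compilation.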

qk11853 commented 2 weeks ago

Thank you very much for your patient answer!