Open kliem opened 4 years ago
Some fix should be easy: just use try ... except ... finally to properly clean up the list on failure.
However, I don't entirely understand how cdef list is supposed to behave.
I would expect that Cython has a chunk of "exit code" for each function that decrefs all the local variables of Python-object types ("list" and "Integer" would be such types). These are the types that Cython manages references for (<PyObject *> is not). If sig_check doesn't hook into that code then this memory leak is rather fundamental, and then using sig_check for simple things like keyboard interrupts is much more complicated to get right -- you'd basically always need a try...finally around it to explicitly delete all the assigned-to local variables to avoid memory leaks!
In the example code, the sorted list probably leaks much more noticeably, but the apn Integer should leak just the same (only in much smaller increments). If this is indeed the case, then using sig_check correctly (meaning here: in such a way that you don't create memory leaks) would be incredibly onerous. This should be run by the cysignals people, because it would be far preferable to resolve this in a way that doesn't need a try/finally around each sig_check.
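The try/finally cleanup discussed above can be sketched in plain Python (not Cython). This is only an illustration of the pattern under stated assumptions: Item and check_interrupt are hypothetical stand-ins for Integer and sig_check, and the sketch says nothing about the exit code Cython actually generates. To make the effect visible, the caller deliberately holds on to the exception, which keeps the traceback and the interrupted frame alive.

```python
import weakref


class Item:
    """Hypothetical stand-in for a heavyweight element such as Integer."""


def check_interrupt(i, fail_at):
    """Hypothetical stand-in for sig_check(): 'interrupts' at step fail_at."""
    if i == fail_at:
        raise KeyboardInterrupt


def build(n, fail_at, refs, cleanup=True):
    """Append n items to a local list, recording a weak reference to each."""
    items = []
    obj = None
    finished = False
    try:
        for i in range(n):
            obj = Item()
            refs.append(weakref.ref(obj))
            items.append(obj)
            check_interrupt(i, fail_at)
        finished = True
        return items
    finally:
        if cleanup and not finished:
            # Drop every reference this frame holds before the exception
            # propagates, so nothing depends on the frame being torn down.
            items.clear()
            obj = None


def survivors(cleanup):
    """Count items still alive while the exception (and its frames) is held."""
    refs = []
    held = None
    try:
        build(1000, 500, refs, cleanup=cleanup)
    except KeyboardInterrupt as exc:
        held = exc  # keeps the traceback, and hence build()'s frame, alive
    alive = sum(ref() is not None for ref in refs)
    del held
    return alive
```

On CPython, which frees objects as soon as their reference count drops to zero, survivors(cleanup=True) should report no live items, while survivors(cleanup=False) should report every item created before the interrupt. The cost is exactly the one complained about above: the cleanup has to be written by hand around every interruptible loop.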
This seems to be unrelated to cysignals. Instead, registering a keyboard interrupt in Sage permanently breaks garbage collection.
Define foo as follows:
sage: cython('''
....: from sage.rings.integer cimport Integer
....: from sage.ext.stdsage cimport PY_NEW
....: def foo():
....:     cdef list sorted = []
....:     cdef Integer apn
....:     cdef size_t i
....:     for i in range(100000000):
....:         apn = <Integer>PY_NEW(Integer)
....:         sorted.append(apn)
....:         if i == 10000000:
....:             raise ValueError()
....: ''')
I can run foo many times and it never leaks. So far so good.
Now I keyboard interrupt foo. It runs until the end and then raises:
ValueError Traceback (most recent call last)
/srv/public/kliem/sage/local/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_code(self, code_obj, result, async_)
3330 else:
-> 3331 exec(code_obj, self.user_global_ns, self.user_ns)
3332 finally:
<ipython-input-13-c19b6d9633cf> in <module>
----> 1 foo()
/srv/public/kliem/.sage/temp/cofio/13462/spyx/_srv_public_kliem__sage_temp_cofio_13462_tmp_d9rlh2x7_pyx/_srv_public_kliem__sage_temp_cofio_13462_tmp_d9rlh2x7_pyx_0.pyx in _srv_public_kliem__sage_temp_cofio_13462_tmp_d9rlh2x7_pyx_0.foo()
11 if i == 10000000:
---> 12 raise ValueError()
ValueError:
During handling of the above exception, another exception occurred:
KeyboardInterrupt Traceback (most recent call last)
src/cysignals/signals.pyx in cysignals.signals.python_check_interrupt()
KeyboardInterrupt:
And now things are broken. I can run foo once or twice and then lost memory starts piling up, about 300 MB per run.
Even worse: if I keyboard interrupt for some nonsense reason before even cythonizing, then foo leaks right from the start.
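One way to put numbers on "memory piling up per run" is to record how much traced memory remains after each call. The helper below is a minimal sketch using the standard tracemalloc module; growth_per_run is a hypothetical name, and the sketch assumes the leaked memory goes through Python's own allocators. That may not hold for Sage Integer objects, in which case the process's resident memory would have to be measured at the OS level instead.

```python
import tracemalloc


def growth_per_run(func, runs=5):
    """Return the change in traced memory (in bytes) across each call to func."""
    tracemalloc.start()
    try:
        deltas = []
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(runs):
            try:
                func()
            except Exception:
                # foo() in the report always ends in ValueError; ignore it.
                pass
            after, _peak = tracemalloc.get_traced_memory()
            deltas.append(after - before)
            before = after
        return deltas
    finally:
        tracemalloc.stop()
```

Calling growth_per_run(foo) before and after the first Ctrl-C would show whether the per-run growth changes from roughly zero to a large, steady increment. A KeyboardInterrupt is not caught by the helper, so interrupting a run still propagates to the caller.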
From the following I deduce that registering an interrupt really is the issue, not the raising of the error itself:
With KeyboardInterrupt, it still collects fine (until I actually hit Ctrl-C some time during my Sage session).
With
try:
    sig_check()
except:
    raise ValueError
it still leaks, even though I never actually raised KeyboardInterrupt.
So apparently the exit code of Cython still figures out that all those <PyObject *> that we created and appended to the list must be deleted. However, once we interrupt once in our Sage session, this changes permanently.
Moving to 9.4, as 9.3 has been released.
From https://groups.google.com/g/sage-release/c/g8TLPf61i3A/m/R-SmU68oCAAJ
When canceling obtaining the divisors of a (large) integer, one leaks massive amounts of memory for the following reason:
The following leaks at keyboard interrupt.
Apparently the elements in the list aren't garbage collected until the list is passed back to python.
The corresponding doctest only fails occasionally:
CC: @slel @jdemeyer @strogdon
Component: basic arithmetic
Issue created by migration from https://trac.sagemath.org/ticket/30427