Memleak in UniqueRepresentation, @cached_method #12215

Closed vbraun closed 11 years ago

vbraun commented 12 years ago

The documentation says that UniqueRepresentation uses weak references, but the implementation was switched over to the @cached_method decorator, which currently holds strong references. As a result, unused unique parents stay in memory forever:

import sage.structure.unique_representation
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)

for i in range(2,1000):
    ring = ZZ.quotient(ZZ(i))
    vectorspace = ring^2

import gc
gc.collect()
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)
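The difference between a strong and a weak cache can be reproduced without Sage. The following is a minimal plain-Python sketch; the `Parent` class and both cache names are hypothetical stand-ins and are not part of Sage.

```python
import gc
import weakref


class Parent:
    """Hypothetical stand-in for an expensive, unique parent object."""

    def __init__(self, key):
        self.key = key


strong_cache = {}                            # keeps every value alive forever
weak_cache = weakref.WeakValueDictionary()   # entries vanish with their values

for i in range(2, 1000):
    strong_cache[i] = Parent(i)
    weak_cache[i] = Parent(i)   # no other reference survives this statement

gc.collect()
strong_entries = len(strong_cache)   # every object is still referenced
weak_entries = len(weak_cache)       # unreferenced values have been dropped
```

With the strong dictionary the number of entries grows with every new key, which is the behaviour reported above; the weak dictionary only holds entries whose values are still referenced elsewhere.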

Related tickets:

Further notes:

Apply

CC: @simon-king-jena @jdemeyer @mwhansen @vbraun @jpflori

Component: memleak

Keywords: UniqueRepresentation cached_method caching

Author: Simon King

Reviewer: Nils Bruin

Merged: sage-5.7.beta1

Issue created by migration from https://trac.sagemath.org/ticket/12215

simon-king-jena commented 12 years ago

Description changed:

--- 
+++ 
@@ -15,6 +15,7 @@
 Related tickets:
 * #11521 (needs review, introducing weak references for caching homsets), and
 * #715 (needs a lot of work, eventually aiming at using weak references for caching coerce maps). 
+* #5970 (the polynomial rings cache use strong references)

 Further notes:
 * not everything in Python can be weakref'ed, for example ``None`` cannot.

simon-king-jena commented 12 years ago
comment:3

See my comment at #5970: It seems that having a weak version of cached_function (which is used to decorate UniqueRepresentation.__classcall__) is the missing bit (in addition to #11521 and #715 and a two-line change in the polynomial ring constructor) for fixing the issues at #5970.

I think this should be done on top of #11115, which rewrites cached methods and already has a positive review.
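To illustrate what a weak version of a cached function could look like, here is a rough sketch in plain Python. It is not the code from the patch: the decorator name `weak_cached`, the `Ring` class and `quotient_ring` are invented for this example, and keyword arguments are deliberately ignored.

```python
import functools
import gc
import weakref


def weak_cached(func):
    """Cache results of ``func`` without keeping them alive (illustration only)."""
    cache = weakref.WeakValueDictionary()

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            result = func(*args)
            # Raises TypeError if ``result`` cannot be weakly referenced.
            cache[args] = result
            return result

    wrapper.cache = cache
    return wrapper


class Ring:
    """Hypothetical result type; ordinary instances support weak references."""

    def __init__(self, modulus):
        self.modulus = modulus


@weak_cached
def quotient_ring(modulus):
    return Ring(modulus)


r = quotient_ring(7)
same_object = quotient_ring(7) is r   # identical while a strong reference exists
del r
gc.collect()
entries_after_delete = len(quotient_ring.cache)   # the entry went away with the ring
```

Uniqueness is preserved for as long as somebody holds the result, and the cache stops pinning results that nobody uses any more.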

simon-king-jena commented 12 years ago

Dependencies: #11115

simon-king-jena commented 12 years ago
comment:5

Here is a patch. It isn't tested yet.

simon-king-jena commented 12 years ago
comment:6

... and I immediately updated the patch: Join categories were not using unique representation but cached_function (by #11900). So, that had to change.

simon-king-jena commented 12 years ago

Changed dependencies from #11115 to #11115 #11900

simon-king-jena commented 12 years ago
comment:7

Sorry, it was impossible to use weak_cached_function on the join function in sage.categories.category, since it may return a list (not weakly referenceable). Hence, I had to work around. With the attached patch (applied on top of #11900 and its dependencies), sage at least starts...
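The restriction mentioned here is easy to check in plain Python. The sketch below probes a few built-in types; `is_weakly_referenceable`, `Category` and `WeakList` are names made up for the example, and the list subclass is only one possible workaround, not necessarily the one used in the patch.

```python
import weakref


class Category:
    """Ordinary class instances support weak references by default."""


class WeakList(list):
    """A trivial list subclass; unlike ``list`` it can be weakly referenced."""


def is_weakly_referenceable(obj):
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


results = {
    "instance": is_weakly_referenceable(Category()),
    "list": is_weakly_referenceable([1, 2]),
    "tuple": is_weakly_referenceable((1, 2)),
    "None": is_weakly_referenceable(None),
    "list subclass": is_weakly_referenceable(WeakList([1, 2])),
}
```

A function that may return a plain list, a tuple or ``None`` therefore cannot be put behind a weak-value cache without wrapping or converting the result first.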

simon-king-jena commented 12 years ago
comment:8

It turns out that all the patches still cannot fix the problem. We also have to deal with sage.structure.factory.UniqueFactory.

I suggest adding an option to UniqueFactory that decides whether a strong or a weak cache is used. And I suggest doing this here, because I don't want to create yet another ticket.

The applications of UniqueFactory should mainly be in cases where weak references work. Therefore I suggest using the weak cache by default - I am curious how many doc tests will fail...

Coercion sucks.
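A factory with a selectable cache policy, as suggested in this comment, might look roughly like the following. This is a toy model only: `SimpleFactory`, its `weak` parameter and the `Field` class are assumptions made for the example and do not reflect the actual UniqueFactory interface.

```python
import gc
import weakref


class SimpleFactory:
    """Toy factory with a selectable cache policy (not Sage's UniqueFactory)."""

    def __init__(self, constructor, weak=True):
        self._constructor = constructor
        # Weak by default, as proposed above; a plain dict gives the strong variant.
        self._cache = weakref.WeakValueDictionary() if weak else {}

    def __call__(self, *key):
        obj = self._cache.get(key)
        if obj is None:
            obj = self._constructor(*key)
            self._cache[key] = obj
        return obj

    def cache_size(self):
        return len(self._cache)


class Field:
    """Hypothetical object produced by the factory."""

    def __init__(self, p):
        self.p = p


weak_factory = SimpleFactory(Field, weak=True)
strong_factory = SimpleFactory(Field, weak=False)

for p in (2, 3, 5, 7):
    weak_factory(p)      # result is dropped immediately
    strong_factory(p)    # result is kept by the cache

gc.collect()
```

After the loop the strong factory still holds all four fields, while the weak factory holds none of them because nothing else refers to them.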

simon-king-jena commented 12 years ago
comment:9

It turns out that UniqueFactory already was somehow using weak references, but in an improper way. The new patch version replaces that by WeakValueDictionary.

It doesn't solve the problem, though.

simon-king-jena commented 12 years ago
comment:10

I have slightly updated my patch, so that there is no conflict with #11935.

simon-king-jena commented 12 years ago
comment:11

There is yet another location where it makes sense to use @weak_cached_function: For the cache of dynamic classes!

Namely, dynamic classes are frequently used in the category framework, they have a strong cache, and the parent/element classes keep a pointer to the category they belong to. So, that's preventing categories from being garbage collected.

I think that my patches from here, #715, and #11935 (which reduces the number of dynamic classes created) might actually be enough to fix the problem. When I run

sage: for p in primes(2,1000000):
....:     R = GF(p)['x','y','z']
....:     print get_memory_usage()

then one initially still sees an increased memory usage. But after a while it seems to stabilise.
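The reference chain described in this comment (a strong class cache holds the dynamic class, and the class holds its category) can be modelled in plain Python. Everything below is a simplified sketch with invented names (`Category`, `class_cache`, `dynamic_class`); it is not how Sage builds dynamic classes.

```python
import gc
import weakref


class Category:
    def __init__(self, name):
        self.name = name


class_cache = {}   # strong cache for the dynamically created classes


def dynamic_class(category):
    """Build (and strongly cache) a class that points back to its category."""
    key = category.name
    if key not in class_cache:
        class_cache[key] = type(
            "ParentClass_" + key, (object,), {"_category": category}
        )
    return class_cache[key]


cat = Category("Rings")
probe = weakref.ref(cat)   # lets us observe whether the category is still alive
dynamic_class(cat)

del cat
gc.collect()
alive_with_strong_cache = probe() is not None   # the cached class still references it

class_cache.clear()
gc.collect()
alive_after_clearing = probe() is not None      # nothing references it any more
```

As long as the class sits in the strong cache, deleting every user-visible reference to the category does not free it; once the cache entry is gone, the garbage collector can reclaim both.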

simon-king-jena commented 12 years ago

Description changed:

--- 
+++ 
@@ -14,8 +14,8 @@

Related tickets:

simon-king-jena commented 12 years ago
comment:12

I have updated the patch. It documents the changes, and at least the tests in sage/misc/cachefunc.pyx, in sage/categories/..., in sage/rings/... and in sage/structure/unique_representation.py pass.

Hence, needs review!

simon-king-jena commented 12 years ago

Author: Simon King

simon-king-jena commented 12 years ago

Description changed:

--- 
+++ 
@@ -15,7 +15,7 @@
 Related tickets:
 * #11521 (needs review, introducing weak references for caching homsets), and
 * #715 (using weak references for caching coerce maps). 
-* #5970 (the polynomial rings cache use strong references, which may now be a duplicate, as I introduce the weak cache here)
+* #5970 (the polynomial rings cache use strong references, which may now be a duplicate, as I introduce the weak cache in #715)

 Further notes:
 * not everything in Python can be weakref'ed, for example ``None`` cannot.

simon-king-jena commented 12 years ago

Work Issues: segfaults for elliptic curves

simon-king-jena commented 12 years ago
comment:15

While the tests in sage/categories, sage/rings and sage/structure/unique_representation.py pass, I get some segfaults for the elliptic curve tests. Thus, needs work.

simon-king-jena commented 12 years ago
comment:16

I did sage -t --verbose "devel/sage-main/sage/schemes/elliptic_curves/ell_point.py", and it did not reveal a segfault while running the tests. The test process itself crashed:

830 tests in 54 items.
830 passed and 0 failed.
Test passed.
The doctested process was killed by signal 11
         [23.8 s]

----------------------------------------------------------------------
The following tests failed:

        sage -t --verbose "devel/sage-main/sage/schemes/elliptic_curves/ell_point.py" # Killed/crashed

Strange.

simon-king-jena commented 12 years ago
comment:17

I think I found the problem.

Some doctest of the form

sage: K.residue_field()
<expected answer>

segfaults. But when the result is assigned to a variable, like this

sage: RF = K.residue_field(); RF
<expected answer>

then everything works.

Is it perhaps the case that garbage collection of the residue field (that was enabled by my patch) happens between the creation and the computation of the string representation of the object?

But that is strange. There are variables _ and __, which are supposed to provide strong references to the last two results - hence, there should be no garbage collection.

simon-king-jena commented 12 years ago
comment:18

sage.structure.factory.UniqueFactory did use weak references before. But it did so - I think - improperly, namely without using weakref.WeakValueDictionary. The new patch version changes that.

It isn't ready for review, yet, because of the segfaults.

simon-king-jena commented 12 years ago
comment:19

Some old code is not using the cache: There was some coerce map created in sage/rings/residue_field.pyx, whose parent was not created by Hom(domain,codomain), but directly by RingHomset(domain,codomain).

Changing it fixed at least one segfault. I wish all segfaults would go away so easily...

simon-king-jena commented 12 years ago
comment:20

Fortunately, I now have a short example that triggers a memory access error when leaving Sage:

sage: E = EllipticCurve('15a1')
sage: K.<t>=NumberField(x^2+2*x+10)
sage: EK=E.base_extend(K)
sage: EK.torsion_subgroup()
Torsion Subgroup isomorphic to Z/4 + Z/4 associated to the Elliptic Curve defined by y^2 + x*y + y = x^3 + x^2 + (-10)*x + (-10) over Number Field in t with defining polynomial x^2 + 2*x + 10
sage: quit
Exiting Sage (CPU time 0m1.98s, Wall time 0m52.03s).
local/bin/sage-sage: line 303: 30045 Segmentation fault  sage-ipython "$@" -i

However, I wonder how I can trigger the error without leaving Sage, and how I can trace what is going on.

simon-king-jena commented 12 years ago
comment:21

Actually EK._torsion_bound(number_of_places=20) is enough to trigger the memory access error.

vbraun commented 12 years ago
comment:22

Here is the stack:

Program terminated with signal 11, Segmentation fault.
#0  cgetg (y=22, x=<optimized out>) at ../src/kernel/none/level1.h:114
114 ../src/kernel/none/level1.h: No such file or directory.
    in ../src/kernel/none/level1.h
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.16-gdb.py", line 59, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
  File "/usr/lib64/../share/gcc-4.6.2/python/libstdcxx/v6/printers.py", line 19, in <module>
    import itertools
ImportError: No module named itertools
Missing separate debuginfos, use: debuginfo-install atlas-3.8.4-1.fc16.x86_64 expat-2.0.1-11.fc15.x86_64 fontconfig-2.8.0-4.fc16.x86_64 keyutils-libs-1.5.2-1.fc16.x86_64 krb5-libs-1.9.2-4.fc16.x86_64 libcom_err-1.41.14-2.fc15.x86_64 libselinux-2.1.6-5.fc16.x86_64 ncurses-libs-5.9-2.20110716.fc16.x86_64 openssl-1.0.0e-1.fc16.x86_64
(gdb) bt
#0  cgetg (y=22, x=<optimized out>) at ../src/kernel/none/level1.h:114
#1  convi (x=0x288b2a8, l=0x7fff6a2f8a38) at ../src/kernel/gmp/mp.c:1288
#2  0x00007f11fb1637ec in itostr_sign (x=<optimized out>, sx=1, len=0x7fff6a2f8b48) at ../src/language/es.c:500
#3  0x00007f11fb167b4f in str_absint (x=0x288b2a8, S=0x7fff6a2f8cb0) at ../src/language/es.c:1778
#4  bruti_intern (g=0x288b2a8, T=<optimized out>, S=0x7fff6a2f8cb0, addsign=1) at ../src/language/es.c:2557
#5  0x00007f11fb168453 in bruti_intern (g=0x288b2d8, T=0x7f11fb4b27a0, S=0x7fff6a2f8cb0, addsign=<optimized out>)
    at ../src/language/es.c:2730
#6  0x00007f11fb1679ae in GENtostr_fun (out=0x7f11fb16a7b0 <bruti>, T=0x7f11fb4b27a0, x=0x288b2d8)
    at ../src/language/es.c:1645
#7  GENtostr (x=0x288b2d8) at ../src/language/es.c:1651
#8  0x00007f11f5ae5c44 in gcmp_sage (y=0x583d1b8, x=<optimized out>) at sage/libs/pari/misc.h:60
#9  __pyx_f_4sage_4libs_4pari_3gen_3gen__cmp_c_impl (__pyx_v_left=<optimized out>, __pyx_v_right=<optimized out>)
    at sage/libs/pari/gen.c:8513
#10 0x00007f11f8663227 in __pyx_f_4sage_9structure_7element_7Element__richcmp_c_impl (__pyx_v_left=0x5780e10, 
    __pyx_v_right=<optimized out>, __pyx_v_op=2) at sage/structure/element.c:7775
#11 0x00007f11f86875ec in __pyx_f_4sage_9structure_7element_7Element__richcmp (__pyx_v_left=0x5780e10, 
    __pyx_v_right=0x5863f70, __pyx_v_op=2) at sage/structure/element.c:7498
#12 0x00007f11f5ae045b in __pyx_pf_4sage_4libs_4pari_3gen_3gen_44__richcmp__ (__pyx_v_left=<optimized out>, 
    __pyx_v_right=<optimized out>, __pyx_v_op=<optimized out>) at sage/libs/pari/gen.c:8475
#13 0x00007f1208b32e6a in try_rich_compare (v=0x5780e10, w=0x5863f70, op=2) at Objects/object.c:619
#14 0x00007f1208b3518d in try_rich_compare_bool (op=<optimized out>, w=<optimized out>, v=<optimized out>)
    at Objects/object.c:647
#15 try_rich_to_3way_compare (w=0x5863f70, v=0x5780e10) at Objects/object.c:681
#16 do_cmp (w=0x5863f70, v=0x5780e10) at Objects/object.c:834
#17 PyObject_Compare (v=0x5780e10, w=0x5863f70) at Objects/object.c:863
#18 0x00007f1208af5ae5 in PyObject_Cmp (o1=<optimized out>, o2=<optimized out>, result=0x7fff6a2f8f0c)
    at Objects/abstract.c:41
#19 0x00007f1208b879d4 in builtin_cmp (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:422
#20 0x00007f1208b917fd in call_function (oparg=<optimized out>, pp_stack=0x7fff6a2f9000) at Python/ceval.c:3706
#21 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389
#22 0x00007f1208b934d9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, 
    args=<optimized out>, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:2968
#23 0x00007f1208b1f7f6 in function_call (func=0x1fec9b0, arg=0x5864518, kw=0x0) at Objects/funcobject.c:524
#24 0x00007f1208af97a3 in PyObject_Call (func=0x1fec9b0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#25 0x00007f1208b0667f in instancemethod_call (func=0x1fec9b0, arg=0x5864518, kw=0x0)
    at Objects/classobject.c:2579
#26 0x00007f1208af97a3 in PyObject_Call (func=0x579d0f0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#27 0x00007f1208b545c6 in half_compare (self=<optimized out>, other=<optimized out>) at Objects/typeobject.c:5253
#28 0x00007f1208b547a5 in _PyObject_SlotCompare (self=0x5713af0, other=0x5866af0) at Objects/typeobject.c:5278
#29 0x00007f1208b35260 in do_cmp (w=0x5866af0, v=0x5713af0) at Objects/object.c:817
#30 PyObject_Compare (v=0x5713af0, w=0x5866af0) at Objects/object.c:863
#31 0x00007f1208af5ae5 in PyObject_Cmp (o1=<optimized out>, o2=<optimized out>, result=0x7fff6a2f955c)
    at Objects/abstract.c:41
#32 0x00007f1208b879d4 in builtin_cmp (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:422
#33 0x00007f1208af97a3 in PyObject_Call (func=0x7f120903e2d8, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#34 0x00007f11e5b541fc in __pyx_pf_4sage_5rings_13residue_field_20ResidueField_generic_8__cmp__ (
    __pyx_self=<optimized out>, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>)
    at sage/rings/residue_field.c:7317
#35 0x00007f1208af97a3 in PyObject_Call (func=0x22e7200, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#36 0x00007f1208b0667f in instancemethod_call (func=0x22e7200, arg=0x586d320, kw=0x0)
    at Objects/classobject.c:2579
#37 0x00007f1208af97a3 in PyObject_Call (func=0x4e32aa0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#38 0x00007f1208b545c6 in half_compare (self=<optimized out>, other=<optimized out>) at Objects/typeobject.c:5253
#39 0x00007f1208b547a5 in _PyObject_SlotCompare (self=0x57f2410, other=0x5906e90) at Objects/typeobject.c:5278
#40 0x00007f1208b35260 in do_cmp (w=0x5906e90, v=0x57f2410) at Objects/object.c:817
#41 PyObject_Compare (v=0x57f2410, w=0x5906e90) at Objects/object.c:863
#42 0x00007f1208af5ae5 in PyObject_Cmp (o1=<optimized out>, o2=<optimized out>, result=0x7fff6a2f99bc)
    at Objects/abstract.c:41
#43 0x00007f1208b879d4 in builtin_cmp (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:422
#44 0x00007f1208b917fd in call_function (oparg=<optimized out>, pp_stack=0x7fff6a2f9ab0) at Python/ceval.c:3706
#45 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389
#46 0x00007f1208b92593 in fast_function (nk=<optimized out>, na=2, n=<optimized out>, pp_stack=0x7fff6a2f9c10, 
    func=0x3b0a410) at Python/ceval.c:3792
#47 call_function (oparg=<optimized out>, pp_stack=0x7fff6a2f9c10) at Python/ceval.c:3727
#48 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389
#49 0x00007f1208b934d9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, 
    args=<optimized out>, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:2968
#50 0x00007f1208b1f7f6 in function_call (func=0x3af28c0, arg=0x588d7a0, kw=0x0) at Objects/funcobject.c:524
#51 0x00007f1208af97a3 in PyObject_Call (func=0x3af28c0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#52 0x00007f1208b0667f in instancemethod_call (func=0x3af28c0, arg=0x588d7a0, kw=0x0)
    at Objects/classobject.c:2579
#53 0x00007f1208af97a3 in PyObject_Call (func=0x4e29be0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#54 0x00007f1208b545c6 in half_compare (self=<optimized out>, other=<optimized out>) at Objects/typeobject.c:5253
#55 0x00007f1208b547a5 in _PyObject_SlotCompare (self=0x579f908, other=0x58c5528) at Objects/typeobject.c:5278
#56 0x00007f1208b34dad in PyObject_RichCompare (v=0x579f908, w=0x58c5528, op=2) at Objects/object.c:967
#57 0x00007f1208b3505f in PyObject_RichCompareBool (v=<optimized out>, w=<optimized out>, op=<optimized out>)
    at Objects/object.c:1001
#58 0x00007f1208b49264 in tuplerichcompare (op=2, w=0x5898a70, v=0x578bf38) at Objects/tupleobject.c:546
#59 tuplerichcompare (v=0x578bf38, w=0x5898a70, op=2) at Objects/tupleobject.c:517
#60 0x00007f1208b34d71 in PyObject_RichCompare (v=0x578bf38, w=0x5898a70, op=2) at Objects/object.c:958
#61 0x00007f1208b3505f in PyObject_RichCompareBool (v=<optimized out>, w=<optimized out>, op=<optimized out>)
    at Objects/object.c:1001
#62 0x00007f1208b49264 in tuplerichcompare (op=2, w=0x5898ab8, v=0x579c368) at Objects/tupleobject.c:546
#63 tuplerichcompare (v=0x579c368, w=0x5898ab8, op=2) at Objects/tupleobject.c:517
#64 0x00007f1208b34d71 in PyObject_RichCompare (v=0x579c368, w=0x5898ab8, op=2) at Objects/object.c:958
#65 0x00007f1208b3505f in PyObject_RichCompareBool (v=<optimized out>, w=<optimized out>, op=<optimized out>)
    at Objects/object.c:1001
#66 0x00007f1208b2f305 in lookdict (mp=0x14c51d0, key=<optimized out>, hash=-1399715627429533172)
    at Objects/dictobject.c:351
#67 0x00007f1208b3087c in PyDict_DelItem (op=0x14c51d0, key=0x5898ab8) at Objects/dictobject.c:742
#68 0x00007f1208b8e924 in PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:1555
#69 0x00007f1208b934d9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, 
    args=<optimized out>, argcount=1, kws=0x0, kwcount=0, defs=0x1458be8, defcount=1, closure=0x0)
    at Python/ceval.c:2968
#70 0x00007f1208b1f7f6 in function_call (func=0x1464320, arg=0x7f1208fdf510, kw=0x0) at Objects/funcobject.c:524
#71 0x00007f1208af97a3 in PyObject_Call (func=0x1464320, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#72 0x00007f1208afa1e0 in PyObject_CallFunctionObjArgs (callable=0x1464320) at Objects/abstract.c:2723
#73 0x00007f1208bc4146 in handle_weakrefs (old=0x7f1208e52b40, unreachable=0x7fff6a2fa700)
    at Modules/gcmodule.c:607
#74 collect (generation=2) at Modules/gcmodule.c:859
#75 0x00007f1208bc4b04 in PyGC_Collect () at Modules/gcmodule.c:1292
#76 0x00007f1208bb6d73 in Py_Finalize () at Python/pythonrun.c:424
#77 0x00007f1208bb5c38 in Py_Exit (sts=0) at Python/pythonrun.c:1714
#78 0x00007f1208bb5d2f in handle_system_exit () at Python/pythonrun.c:1116
#79 0x00007f1208bb5fc5 in handle_system_exit () at Python/pythonrun.c:1078
#80 PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:1126
#81 0x00007f1208bb643e in PyRun_SimpleFileExFlags (fp=<optimized out>, filename=<optimized out>, closeit=1, 
    flags=0x7fff6a2fa9e0) at Python/pythonrun.c:935
#82 0x00007f1208bc35a3 in Py_Main (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:599
#83 0x00007f1207e7569d in __libc_start_main (main=0x400620 <main>, argc=3, ubp_av=0x7fff6a2fab08, 
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff6a2faaf8)
    at libc-start.c:226
#84 0x0000000000400651 in _start ()

For the record, I enabled coredumps and then ran gdb --core core.4522 local/bin/python

simon-king-jena commented 12 years ago
comment:23

Thank you! How does one enable coredumps?

simon-king-jena commented 12 years ago
comment:24

If I understand correctly, the coredump says that it occurs while doing c = cmp(self.p, x.p), where x and self are residue fields.

simon-king-jena commented 12 years ago
comment:25

Yep, I just inserted a print statement before and after the "cmp" line. When leaving sage, the first line was printed, and the segfault happened before printing the second line. Hence, the problem occurs when comparing fractional ideals.

vbraun commented 12 years ago
comment:26

To enable coredumps (at least with bash):

ulimit -c unlimited

simon-king-jena commented 12 years ago
comment:27

Aha! A comparison of two sage.libs.pari.gen.gen happens after _unsafe_deallocate_pari_stack is called, which closes pari. That is, of course, bad.

Only I wonder how the order can be changed. Alternatively, it could be tested before comparison whether pari is still alive. But that would result in a slow-down.

simon-king-jena commented 12 years ago
comment:28

I guess one must make sure that there is a strong reference to the (unique?) pari instance until all sage.libs.pari.gen.gen are deallocated.

simon-king-jena commented 12 years ago
comment:29

pari._unsafe_deallocate_pari_stack is called in sage.all.quit_sage. It does not help to move it to the end of quit_sage. I wonder why it is not put into a proper __del__ method of PariInstance? Is it really needed to be in quit_sage??

simon-king-jena commented 12 years ago
comment:30

Yessss! When removing _unsafe_deallocate_pari_stack from quit_sage and renaming it into a __dealloc__ method, then the segfault vanishes!

simon-king-jena commented 12 years ago
comment:31

Too bad. It fixes the segfault of sage -t sage/rings/number_field/number_field_ideal.py, but it doesn't help for sage -t sage/schemes/elliptic_curves/heegner.py.

Why is it always the elliptic curves code that causes trouble for my patches?

simon-king-jena commented 12 years ago
comment:32

I have attached a second patch, that fixes two or three segfaults - which isn't enough.

vbraun commented 12 years ago
comment:33

In Python one must not use __dealloc__() to free C resources, at least not unless you are absolutely certain that the Python object does not participate in circular references.

Does it help to do an explicit gc.collect() at the end of quit_sage and only then deallocate Pari? If not we might have to give up clearing the Pari stack...

simon-king-jena commented 12 years ago
comment:34

Replying to @vbraun:

In Python one must not use __dealloc__() to free C resources, at least not unless you are absolutely certain that the Python object does not participate in circular references.

Do you mean __del__? If I remember correctly, __dealloc__ is Cython, has nothing to do with the ability of the garbage collector to deal with circular references, and it is what one must have if there are C-resources to free after deleting all Python stuff. So, as far as I know, using __dealloc__ (not __del__!) is a clean solution.

Does it help to do an explicit gc.collect() at the end of quit_sage and only then deallocate Pari? If not we might have to give up clearing the Pari stack...

I didn't try.

vbraun commented 12 years ago
comment:35

Yes, you are right: __dealloc__ is ok, __del__ is not.

But the problem seems to be that we finalize Pari before finalizing all Pari elements. Ideally, elements keep their parent alive because they hold a reference but I think GENs are often used in an ad-hoc way in Sage. So moving the Pari finalizer to __dealloc__ just makes it run later, but still gives no guarantees about finalizer ordering.
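The point about elements keeping their parent alive can be shown with a pure-Python analogy. The sketch below assumes CPython's reference counting and uses `__del__` only to record the order of events; `Library` and `Element` are invented names and do not model the PARI interface.

```python
import gc


events = []


class Library:
    """Stand-in for a C library handle that must outlive its elements."""

    def __del__(self):
        events.append("library closed")


class Element:
    def __init__(self, library, value):
        self._library = library   # strong reference keeps the library alive
        self.value = value

    def __del__(self):
        events.append("element %d freed" % self.value)


lib = Library()
a = Element(lib, 1)
b = Element(lib, 2)

del lib            # the elements still hold the library
gc.collect()
events_while_elements_live = list(events)

del a, b           # last references gone: elements go first, then the library
gc.collect()
```

Because each element holds a strong reference, the library is only finalized after the last element has gone. Objects that use the library without holding such a reference get no ordering guarantee, which is the situation described for GENs above.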

simon-king-jena commented 12 years ago
comment:36

Pari elements have no parent, if I am not mistaken. Adding a parent means: Create an overhead, namely an additional pointer as part of all Pari elements. I am not sure if the number theorists would like that - one might ask on sage-nt.

vbraun commented 12 years ago
comment:37

There is sage.rings.pari_ring which implements parents and elements. But when Pari is used in the Sage library it's usually directly via its C API.

Looking at Python's C API, it seems that Py_AtExit() is what we want: A callback for a cleanup function that is run after Python is finalized. In fact anything from quit_sage() that just finalizes a C library should probably be moved there. See http://docs.python.org/c-api/sys.html

jdemeyer commented 12 years ago
comment:38

Not sure why I was added to "cc". But the newly added doctest in attachment: trac12215_segfault_fixes.patch looks bad because there really should be only one running PariInstance, since global variables are used for the PARI stack (this is the fault of PARI, not of Sage).

simon-king-jena commented 12 years ago
comment:39

Replying to @jdemeyer:

But the newly added doctest in attachment: trac12215_segfault_fixes.patch looks bad because there really should be only one running PariInstance

How else could one test that __dealloc__ works?

jdemeyer commented 12 years ago
comment:40

Replying to @simon-king-jena:

How else could one test that __dealloc__ works?

By Sage not crashing upon exit. I don't see any other way here.

simon-king-jena commented 12 years ago
comment:41

Replying to @jdemeyer:

Replying to @simon-king-jena:

How else could one test that __dealloc__ works?

By Sage not crashing upon exit. I don't see any other way here.

OK. I just thought Sage has a 100% doctest policy.

simon-king-jena commented 12 years ago

Changed work issues from segfaults for elliptic curves to fix it...

simon-king-jena commented 12 years ago
comment:42

With sage-5.0.prealpha0 plus #11780 plus #715 plus #11521 plus #12290, all tests pass. But if one adds the two patches from here, one gets

        sage -t  -force_lib devel/sage/sage/combinat/combinatorial_algebra.py # 4 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/partition.py # 3 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/kschur.py # 17 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/sfa.py # 284 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/macdonald.py # 107 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/hall_littlewood.py # 61 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/llt.py # 50 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/monomial.py # 16 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/orthotriang.py # 25 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/elementary.py # 9 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/homogeneous.py # 9 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/dual.py # 87 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/schur.py # 13 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/ns_macdonald.py # 2 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/powersum.py # 17 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/classical.py # 9 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/jack.py # 35 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/product_species.py # 1 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/composition_species.py # 2 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/functorial_composition_species.py # 3 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/generating_series.py # 44 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/library.py # 4 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/species.py # 2 doctests failed
        sage -t  -force_lib devel/sage/sage/libs/pari/gen.pyx # Killed/crashed

Hopefully most of these errors have a common root.

simon-king-jena commented 12 years ago
comment:43

It seems that a good deal of the errors comes from a method sage.combinat.sf.sf.SymmetricFunctions.register_isomorphism: It registers a coercion, but this is only possible when no coercion has been established for that object before.

What should one do: Catch the error and not register the coercion? Or wipe the registered coercions, by calling sage.structure.parent.Parent.unset_coercions_used?

simon-king-jena commented 12 years ago
comment:44

I have to slightly modify my preceding statement. The error is raised not if there has been established a coercion for that object before, but if there has been a coercion registered between the two objects before. Anyway, the problem remains the same.
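The rule described in the last two comments (a coercion between two objects can only be registered if none has been registered or discovered between them before) can be modelled as follows. This is a deliberately simplified sketch: `ToyParent` and its methods only mimic the behaviour described here and are not the implementation in sage.structure.parent.

```python
class ToyParent:
    """Minimal model of 'register only before first use' (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self._registered = {}      # source parent -> explicitly registered map
        self._discovered = set()   # sources for which a lookup already happened

    def register_coercion(self, source, mapping):
        if source in self._registered or source in self._discovered:
            raise AssertionError(
                "coercion from %s to %s already registered or discovered"
                % (source.name, self.name)
            )
        self._registered[source] = mapping

    def coerce_map_from(self, source):
        self._discovered.add(source)   # the answer is now fixed, even if None
        return self._registered.get(source)


m = ToyParent("m")
e = ToyParent("e")
e.register_coercion(m, lambda x: x)

try:
    e.register_coercion(m, lambda x: x)
    second_registration = "accepted"
except AssertionError as err:
    second_registration = str(err)
```

In this model the second registration fails, and so would a first registration that arrives after `coerce_map_from` has already been asked about the same pair. If parents that used to be rebuilt are now found in a cache, or vice versa, the order of these calls can change, which may be one way such an assertion starts to fire.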

simon-king-jena commented 12 years ago
comment:45

Here is a very short example triggering the error:

sage: P = JackPolynomialsP(QQ,1)
sage: P([2,1])^2

Hopefully this is short enough for debugging - I find it quite mysterious, so far.

simon-king-jena commented 12 years ago
comment:46

Something else is interesting: The error changes when repeating it.

sage: P = JackPolynomialsP(QQ,1)
sage: p = P([2,1])
sage: p^2
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (56, 0))

ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.

Traceback (most recent call last):
...
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/combinat/sf/sf.pyc in register_isomorphism(self, morphism)
    324         mathematically wrong, as above. Use with care!
    325         """
--> 326         morphism.codomain().register_coercion(morphism)
    327 
    328     _shorthands = set(['e', 'h', 'm', 'p', 's'])

/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/structure/parent.so in sage.structure.parent.Parent.register_coercion (sage/structure/parent.c:11955)()

/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/structure/parent.so in sage.structure.parent.Parent.register_coercion (sage/structure/parent.c:11889)()

AssertionError: coercion from Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis to Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis already registered or discovered
sage: p^2
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (56, 0))

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
...
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/combinat/sf/sfa.pyc in _from_cache(self, element, cache_function, cache_dict, **subs_dict)
    631             if sum(part) not in cache_dict:
    632                 cache_function(sum(part))
--> 633             for part2, c2 in cache_dict[sum(part)][part].iteritems():
    634                 c3 = c*c2
    635                 if hasattr(c3,'subs'): # c3 may be in the base ring

KeyError: [2, 1]

Cc to the author of sage/combinat/sf/jack.py.

simon-king-jena commented 12 years ago
comment:47

I inserted some print statements into the register_isomorphism method of symmetric functions. I found that with or without the patch, the isomorphisms are registered both during initialisation of the JackPolynomialsP and before raising an element to a power for the first time:

sage: P = JackPolynomialsP(QQ,1)
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
sage: p = P([2,1])
sage: p^2
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
JackP[2, 2, 1, 1] + JackP[2, 2, 2] + JackP[3, 1, 1, 1] + 2*JackP[3, 2, 1] + JackP[3, 3] + JackP[4, 1, 1] + JackP[4, 2]
sage: p^2
JackP[2, 2, 1, 1] + JackP[2, 2, 2] + JackP[3, 1, 1, 1] + 2*JackP[3, 2, 1] + JackP[3, 3] + JackP[4, 1, 1] + JackP[4, 2]

This gives rise to some questions:

I guess the best solution would be to address the first question: registering the same thing twice is a waste of resources anyway.
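
To illustrate the idea of not registering the same thing twice, here is a minimal sketch in plain Python. It is not the code in sage/combinat/sf: the class `IsomorphismRegistry`, its `register` method and the string labels are all hypothetical stand-ins. It only shows a guard keyed on the (domain, codomain) pair, with the same kind of logging as the print statements above.

```python
class IsomorphismRegistry:
    """Hypothetical stand-in for the bookkeeping behind register_isomorphism."""

    def __init__(self):
        self._registered = {}   # (domain, codomain) -> morphism
        self.log = []           # what the inserted print statements would show

    def register(self, domain, codomain, morphism):
        """Register morphism unless this (domain, codomain) pair is already known.

        Returns True if the morphism was newly registered, False if skipped.
        """
        key = (domain, codomain)
        if key in self._registered:
            self.log.append("skipping %s  TO  %s" % key)
            return False
        self.log.append("registering %s  TO  %s" % key)
        self._registered[key] = morphism
        return True


registry = IsomorphismRegistry()
# Labels are placeholders; real parents would be used as keys instead.
first = registry.register("m over QQ", "e over QQ", "m_to_e")
second = registry.register("m over QQ", "e over QQ", "m_to_e")   # duplicate, skipped
other_ring = registry.register("m over ZZ", "e over ZZ", "m_to_e")  # different key
print("\n".join(registry.log))
```

Note that in the session above the first batch of registrations is over Rational Field and the second over Integer Ring, so with a key of this kind they would count as distinct pairs. Whether the real fix should skip silently or keep raising, as in the AssertionError above, is a separate design decision.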