sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 480 forks source link

Memleak in UniqueRepresentation, @cached_method #12215

Closed vbraun closed 11 years ago

vbraun commented 12 years ago

The documentation says that UniqueRepresentation uses weak refs, but this was switched over to the @cached_method decorator. The latter does currently use strong references, so unused unique parents stay in memory forever:

import sage.structure.unique_representation
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)

for i in range(2,1000):
    ring = ZZ.quotient(ZZ(i))
    vectorspace = ring^2

import gc
gc.collect()
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)

Related tickets:

Further notes:

Apply

CC: @simon-king-jena @jdemeyer @mwhansen @vbraun @jpflori

Component: memleak

Keywords: UniqueRepresentation cached_method caching

Author: Simon King

Reviewer: Nils Bruin

Merged: sage-5.7.beta1

Issue created by migration from https://trac.sagemath.org/ticket/12215

simon-king-jena commented 12 years ago
comment:88

Hooray! It turns out that one can fix the failing doctest in sf.py by not only providing a strongly cached sf.SymmetricFunctions.__classcall__, but by additionally providing a strongly cached sfa.SymmetricFunctionAlgebra_generic.__classcall__.

It is a bit unfortunate that a strong cache creeps back in, but apparently the assumption of strong caching is extensively used in sage.combinat.sf. Having weakly cached unique representation everywhere except in sage.combinat.sf is at least something...

I am now running the full testsuite with the new modification.

simon-king-jena commented 12 years ago
comment:89

Now the second patch has been updated as well. As I have announced, the second patch is now not only introducing a strong custom cache for SymmetricFunctions, but also for the different representations, i.e., for SymmetricFunctionAlgebra_generic. It is certainly not ideal that symmetric functions need to be strongly cached, but I think that there may be a different ticket to resolve this special case.

Anyway, with sage-5.0.beta13 plus #715, #11521, #12313, #11943, #11935 and the two patches from here, all doctests pass. I am confident that they would also pass when one only has 5.0.beta13 plus the two patches from here.

Needs review, then!

simon-king-jena commented 12 years ago

Changed work issues from Keep the fix from #12313. Coercion in symmetric function algebras to none

simon-king-jena commented 12 years ago

Work Issues: rebase wrt #12808

simon-king-jena commented 12 years ago

Changed dependencies from #11115 #11900 #12645 #11599 to #11115 #11900 #12645 #11599 #12215

simon-king-jena commented 12 years ago
comment:90

There is a conflict with #12808, which has a positive review. Hence, we need to rebase it!

nthiery commented 12 years ago
comment:91

Replying to @simon-king-jena:

There is a conflict with #12808, which has a positive review. Hence, we need to rebase it!

(Trivial) rebase done in the updated patch (at the end of the day, I needed it urgently, so I just took over the task).

simon-king-jena commented 12 years ago

Changed work issues from rebase wrt #12808 to none

simon-king-jena commented 12 years ago

Changed dependencies from #11115 #11900 #12645 #11599 #12215 to #11115 #11900 #12645 #11599 #12808

simon-king-jena commented 12 years ago
comment:92

Thank you, Nicolas! I planned to do the change today.

simon-king-jena commented 12 years ago
comment:93

... and I guess it needs review, again?

simon-king-jena commented 12 years ago
comment:94

Anne mentioned on combinat-devel that the patch needs to be rebased because of #7980.

simon-king-jena commented 12 years ago

Changed dependencies from #11115 #11900 #12645 #11599 #12808 to #11115 #11900 #12645 #11599 #12808 #7980

simon-king-jena commented 12 years ago

Work Issues: Rebase wrt #7980

simon-king-jena commented 12 years ago

Description changed:

--- 
+++ 
@@ -23,6 +23,6 @@

 __Apply__

-* [attachment: trac12215_weak_cached_function.patch](https://github.com/sagemath/sage/files/ticket12215/trac12215_weak_cached_function.patch.gz)
+* [attachment: trac12215_weak_cached_function-sk.patch](https://github.com/sagemath/sage/files/ticket12215/trac12215_weak_cached_function-sk.patch.gz)
 * [attachment: trac12215_segfault_fixes.patch](https://github.com/sagemath/sage/files/ticket12215/trac12215_segfault_fixes.patch.gz)
simon-king-jena commented 12 years ago
comment:95

The patches are now rebased rel #7980. I have not been able to replace my first patch, because trac believes that it isn't my patch but Nicolas'...

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

simon-king-jena commented 12 years ago

Changed work issues from Rebase wrt #7980 to none

simon-king-jena commented 12 years ago
comment:97

I don't know how, but I managed to mess up the patch. Now it should work.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

simon-king-jena commented 12 years ago
comment:98

When applying

trac12969_fix_coercion_cache.patch
trac12215_weak_cached_function-sk.patch
trac12215_segfault_fixes.patch
trac_715_combined.patch
trac_11521_homset_weakcache_combined.patch
trac_12313-mono_dict-combined-random-sk.patch

on top of sage-5.2.rc0, all tests on bsd.math pass.

simon-king-jena commented 12 years ago
comment:99

Alas. The second patch does not apply to sage-5.3.beta2. Needs work.

simon-king-jena commented 12 years ago

Work Issues: Rebase rel sage-5.3.beta2

simon-king-jena commented 12 years ago
comment:100

Apparently it is a very severe conflict with sage/combinat/sf/sf.py. Bad, apparently that file has totally changed. I can't even recognise where the patch was supposed to be applied.

simon-king-jena commented 12 years ago
comment:101

Perhaps the changes to sage/combinat/sf/sf.py are not needed? I have replaced the second patch by one that does not touch sf.py (in particular, it does not introduce a strong cache to symmetric function algebras). If that won't work, I can still put the necessary changes into a third patch...

For the moment:

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

simon-king-jena commented 12 years ago

Changed work issues from Rebase rel sage-5.3.beta2 to none

simon-king-jena commented 12 years ago
comment:102

Arrgh. The first patch needs two little changes in the documentation. There were two lines in a doc string that started with an indentation, which was a typo.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

nbruin commented 12 years ago
comment:103

Hi Simon,

I've read the patches to see if there is anything weird or questionable. It looks good for the most part. Just some things you might be able to comment on:

main patch:

sage/categories/category.py

You make some changes to _join, which has the note

It is used for getting a temporary speed-up at trac ticket #11900.
But it is supposed to be replaced by a better solution at trac
ticket #11943.

and #11943 was merged in 5.1! Is this function still used?

Otherwise: Looks good!

sefault_fixes:

sage/combinat/sf/sf.py

There is still a change to that file, namely, you add "from sage.misc.cachefunc import cached_function"

which is an odd thing to do in one patch by itself. Perhaps it's a change you failed to revert?

INCREF:

scary stuff ... I don't like black magic in code, but you seem to know what you're doing...

Conclusion:

I'd be close to giving a positive review. I haven't run test suits (yet), but you've done a lot of that and the bot should be doing that anyway. The changes themselves look fairly straightforward.

simon-king-jena commented 12 years ago
comment:104

Hi Nils,

Replying to @nbruin:

main patch:

sage/categories/category.py

You make some changes to _join, which has the note

It is used for getting a temporary speed-up at trac ticket #11900.
But it is supposed to be replaced by a better solution at trac
ticket #11943.

and #11943 was merged in 5.1! Is this function still used?

Thank you for spotting it. The plan was to replace it, but apparently it wasn't done. So, I removed the note.

sefault_fixes:

sage/combinat/sf/sf.py

There is still a change to that file, namely, you add "from sage.misc.cachefunc import cached_function"

which is an odd thing to do in one patch by itself. Perhaps it's a change you failed to revert?

I have removed the import. I also removed the strongly cached classcall from sage/combinat/sf/sfa.py: it turns out that a strong cache for symmetric function algebras is not needed any more, after the recent overhaul.

INCREF:

scary stuff ...

I know. I am now testing whether it it still needed.

Anyway, it should now be ready for review. I think, to be on the safe side, we should use #11521 as a dependency, which meanwhile has a positive review (thank you!).

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

simon-king-jena commented 12 years ago

Changed dependencies from #11115 #11900 #12645 #11599 #12808 #7980 to #11115 #11900 #12645 #11599 #12808 #7980 #11521

simon-king-jena commented 12 years ago

Work Issues: Do not INCREF when deallocating

simon-king-jena commented 12 years ago
comment:106

Great! The scary INCREF apparently is not needed!! All tests in sage/schemes and sage/sf pass. The following two examples previously resulted in a segfault, but now they don't:

sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 3, 10)
[]
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 5, 10)
[q - 2*q^3 - 2*q^5 + 4*q^7 - q^9 + O(q^10)]
sage: quit
Exiting Sage (CPU time 0m0.36s, Wall time 0m32.82s).
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 5, 10)
[q - 2*q^3 - 2*q^5 + 4*q^7 - q^9 + O(q^10)]
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 3, 10)
[]
sage: quit
Exiting Sage (CPU time 0m0.29s, Wall time 0m4.68s).

So, I will (a bit later today) update the segfault patch, fortunately removing the odd INCREF.

simon-king-jena commented 12 years ago

Changed work issues from Do not INCREF when deallocating to none

simon-king-jena commented 12 years ago
comment:107

The INCREF is gone!

Note that in the commit message of the old patch version, I mention a segfault that occurs in combination with #12313. But I think - if the segfault still occurs - it should be dealt with there.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

nbruin commented 12 years ago
comment:108

Are you sure you produced these patches against a clean 5.3beta1? Your attachment: trac12215_segfault_fixes.patch still has a (trivial) hunk for sage/combinat/sf/sfa.py which fails to apply for me. If that were the only problem you could just remove that hunk, but I'm also getting quite some doctest failures along the lines of comment:42 (but the pari test is fine now!)

I confirm that I'm not seeing any of the problems that necessitated the INCREF hack, so you're good for that. I'm doubtful that the combinatorics stuff is fixed, though ...

simon-king-jena commented 12 years ago
comment:109

Replying to @nbruin:

Are you sure you produced these patches against a clean 5.3beta1?

I am sure that I produced these patches against 5.3.beta2 (not beta1). I don't know if Jeroen did some last minute changes to beta2; I downloaded it two days ago.

simon-king-jena commented 12 years ago
comment:110

I guess #5457 needs to be added as a dependency. It rewrites the symmetric function code, and was merged in beta2.

simon-king-jena commented 12 years ago

Changed dependencies from #11115 #11900 #12645 #11599 #12808 #7980 #11521 to #11115 #11900 #12645 #11599 #12808 #7980 #11521 #5457

simon-king-jena commented 12 years ago
comment:111

The patchbot reports:

sage -t  -force_lib devel/sage-12215/sage/structure/coerce.pyx
**********************************************************************
File "/opt/patchbot-5.3.beta2/devel/sage-12215/sage/structure/coerce.pyx", line 179:
    sage: cm.get_stats()
Expected:
    ((0, 1.0, 4), (0, 0.25, 1))
Got:
    ((0, 1.0, 4), (0, 0.0, 0))

Since I do not get that error, it seems that the distribution of data into the buckets is machine dependent. Hence, I suggest to mark that test as random.

simon-king-jena commented 12 years ago

Work Issues: Fix one doc test

simon-king-jena commented 12 years ago
comment:112

Replying to @simon-king-jena:

Since I do not get that error,

Wrong! I do get the same error. Hence, needs work.

simon-king-jena commented 12 years ago
comment:113

I fixed the doc test (by modifying the second patch).

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

simon-king-jena commented 12 years ago

Changed work issues from Fix one doc test to none

nbruin commented 12 years ago
comment:114

Good! indeed, with the new dependency things apply cleanly.

sage/structure/coerce.pyx

Agreed. Things get placed in buckets based on address. I'd expect that to depend on not just the machine but even on the run!

sage/combinat/integer_vectors_mod_permgroup.py

That one times out for me too. With --verbose I get that all 271 tests pass but then it hangs on

A workspace appears to have been corrupted... automatically rebuilding (this is harmless).

which explains a timeout. That might just been my setup, but since one of the bots experienced a similar problem, perhaps this needs attention.

I've seen some startup time failures on sage.math, but the machine is rather loaded. I don't think any of the #5457, #12215 patches are individually responsible for slowdowns.

So at this point, it's just technical bits -- nothing intrinsic about the code proposed here.

simon-king-jena commented 12 years ago
comment:115

When I last changed the second patch, I modified the expected result of some cm.get_stats() test. However, I forgot to mark the result "random". You seem to agree that it is indeed random (depending on memory addresses). Hence, I just changed the second patch accordingly.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

nbruin commented 12 years ago
comment:116

Bot's only complaint:

sage -t  -force_lib devel/sage-12215/sage/sat/boolean_polynomials.py # Killed/crashed

I cannot confirm this error on a clean 5.3b2 patched up to #12313. It pretty much has to be some assert failing in polybori, because I think that's the only component called from there.

simon-king-jena commented 12 years ago
comment:117

The log says:

sage -t  -force_lib devel/sage-12215/sage/sat/boolean_polynomials.py
The doctested process was killed by signal 6

Has this anything to do with my patch? What is signal 6?

nbruin commented 12 years ago

Reviewer: Nils Bruin

jdemeyer commented 12 years ago

Changed dependencies from #11115 #11900 #12645 #11599 #12808 #7980 #11521 #5457 to #11521

simon-king-jena commented 12 years ago
comment:122

There were problems with #715+#11521 on OS X. It is solved by #13447. Hence, moving the dependency...

simon-king-jena commented 12 years ago

Changed dependencies from #11521 to #13447

jdemeyer commented 12 years ago
comment:123

Are the dependencies of this ticket correct, does it really depend on #13447?