Closed by jpflori 10 years ago
Sorry, reduction is not to blame. Doing a garbage collection after the first line and then repeating the first line reproduces the crash. Hence, some parent is not correctly kept alive. Anyway, you can't see it, since I didn't post the code yet...
Replying to @nbruin:
Yes of course.
CDF.convert_map_from(Q) should return a copy equivalent to phi with strong references to domain and codomain. If the original phi is a composition of "weak" (coercion generated) maps then all the components of the returned phi should also be strengthened copies.
In other words, you do think that we should distinguish between underscore methods that are used internally in the coercion system and just return the maps, and an "official" interface that returns strong copies. Do I understand correctly?
Concerning compositions, I agree that the parent in the middle should be kept alive by the composed map (even if this map is in the coercion system, hence, domain and codomain are only weakly referenced): If the composed map is kept in memory, then we need to be able to apply the composition, and hence the "man in the middle" needs to be available.
Replying to @simon-king-jena:
In other words, you do think that we should distinguish between underscore methods that are used internally in the coercion system and just return the maps, and an "official" interface that returns strong copies. Do I understand correctly?
Yes, with your approach the maps stored and used in the internals of the coercion systems are not able to stay healthy on their own. They can only survive within a controlled environment. So you cannot let those maps escape into the wild (the royal society for prevention of cruelty to maps would probably have you arrested). I don't see another solution than making a version that is better prepared for the outside world.
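The split between internal "weak" maps and strengthened copies for the outside world could look roughly like this. This is a minimal Python sketch; WeakMap, strengthened_copy and the attribute names are hypothetical illustrations, not Sage's actual API:

```python
import weakref

class WeakMap:
    """Toy stand-in for a coercion-internal map: it refers to its
    domain only weakly, so it cannot survive outside the coercion
    framework on its own."""
    def __init__(self, domain, codomain):
        self._domain_ref = weakref.ref(domain)
        self._codomain = codomain

    def domain(self):
        dom = self._domain_ref()
        if dom is None:
            raise ValueError("domain has been garbage collected")
        return dom

    def strengthened_copy(self):
        """What an official accessor such as convert_map_from could
        hand out: an equivalent map holding a strong reference to the
        domain, safe to keep around outside the framework."""
        strong = WeakMap(self.domain(), self._codomain)
        strong._strong_domain = self.domain()  # pin the domain
        return strong
```

Holding only the weak version, the domain may vanish under your feet after a garbage collection; the strengthened copy keeps it alive.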
Concerning compositions, I agree that the parent in the middle should be kept alive by the composed map (even if this map is in the coercion system, hence, domain and codomain are only weakly referenced): If the composed map is kept in memory, then we need to be able to apply the composition, and hence the "man in the middle" needs to be available.
Yes, you are correct. You might want to check if compositions do tend to occur in the coercion system. They would be quite painful to work with. The natural way of constructing them would be to have a containing map type with _domain, _codomain, _parent set appropriately, together with a sequence of maps. Those maps would normally be normal, healthy maps with their own strong references to their domains and codomains: a composition would hence carry internally a strong reference to both its domain and codomain (due to the first and last map in the sequence).
The generic way of making a "weak" version of a map would still lead to a map that (internally) keeps strong references to domain and codomain. You'd have to make a custom way of weakening this map. Would you weaken the first and last map in the sequence? Then for a composition of two maps, you'd have to separately keep a reference to the middle (co)domain. Or would you have half-weakened maps as well, that only have a weakened domain or codomain?
That makes me realize a further complication: the copying probably has to happen both ways. Before you prepare a map to become part of the coercion system, you'd have to make sure that the map you're holding is not referenced by anyone outside. Thus, you'd have to make sure that either the origin of the map is guaranteed (it's not a map that is referenced elsewhere--I think this will be impossible to verify in practice, since users can register arbitrary maps as coercions) or you have to make a copy before weakening it (if weakening is an in-place operation, as you proposed above), further upping the cost of coercion discovery. Otherwise, registering a coercion might have the side-effect of weakening a map that someone else is holding already.
(these are the kind of snowballing complications I was afraid of by making a separate type of map suitable for the coercion system)
Replying to @nbruin:
Concerning compositions, I agree that the parent in the middle should be kept alive by the composed map (even if this map is in the coercion system, hence, domain and codomain are only weakly referenced): If the composed map is kept in memory, then we need to be able to apply the composition, and hence the "man in the middle" needs to be available.
Yes, you are correct. You might want to check if compositions do tend to occur in the coercion system.
They do frequently occur, because coercions are found by some kind of backtracking algorithm. But, somewhat surprisingly, I don't get the impression that the crashes I am observing come from this.
Anyway, I agree that one should have a strong reference in the middle.
The generic way of making a "weak" version of a map would still lead to a map that (internally) keeps strong references to domain and codomain.
No. You would have a weak reference to the domain and codomain, but a strong reference to the middle. A FormalCompositeMap, by the way, stores two maps __first and __second, and in my current experimental code I simply make __first.codomain a constant function (if it isn't already).
It could in principle mean that the composite map gets deallocated, while __first stays somewhere else in the coercion system, and now keeps a parent (namely the middle one) alive that may be collectable. I'll worry about it later...
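The "strong reference in the middle" can be illustrated with a toy composition. SimpleMap and Composite below are hypothetical illustrations, not Sage's FormalCompositeMap:

```python
import weakref

class SimpleMap:
    """Toy map whose domain and codomain are only weakly referenced."""
    def __init__(self, domain, codomain, func):
        self._dom = weakref.ref(domain)
        self._cod = weakref.ref(codomain)
        self._func = func
    def codomain(self):
        return self._cod()
    def __call__(self, x):
        return self._func(x)

class Composite:
    """Composition of two weak maps.  Even with weak endpoints, it pins
    the parent in the middle, so the composition stays applicable."""
    def __init__(self, first, second):
        self._first, self._second = first, second
        self._middle = first.codomain()   # strong ref: the "man in the middle"
    def __call__(self, x):
        return self._second(self._first(x))
```

Deleting the middle parent elsewhere does not invalidate the composition, because the Composite itself keeps it alive.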
That makes me realize a further complication: the copying probably has to happen both ways. Before you prepare a map to become part of the coercion system, you'd have to make sure that the map you're holding is not referenced by anyone outside. Thus, you'd have to make sure that either the origin of the map is guaranteed (it's not a map that is referenced elsewhere--I think this will be impossible to verify in practice, since users can register arbitrary maps as coercions)
I think the (unwritten, I am afraid) contract is that register_coercion() is only called in __init__. So, perhaps it should rather be _register_coercion(), to remove it from the user interface.
Replying to @simon-king-jena:
I think the (unwritten, I am afraid) contract is that register_coercion() is only called in __init__. So, perhaps it should rather be _register_coercion(), to remove it from the user interface.
Absolutely not! I think it's a very important feature that coercions can be discovered "lazily", i.e., be registered after the fact. It also means (but this is just a fact of life) that, while parents are supposed to be immutable, their relations in the (global-state!) coercion graph can evolve over time.
You could of course have a _register_coercion for internal use that mandates being passed a map with the promise no-one else will keep a reference to that map, but I'm pretty sure we have to keep an advertised register_coercion. You could ask sage-devel, of course.
At some point there was even an idea to have a context manager to temporarily modify the coercion graph:
K = QuadraticField(3)
with coercion(K.real_embeddings()[0]):
    print 1 + K.0
leading to -0.732050807568877 (assuming the first embedding is the negative one).
For basic computations these things are not so essential, but by the time you're a couple of levels deep, e.g., you want to compute the archimedean period matrices of some abelian variety defined over a number field, letting the coercion framework do the required conversions might be the only quick way to actually get your data into the right ring. I think we don't want to take away that possibility.
Replying to @simon-king-jena:
It could in principle mean that the composite map gets deallocated, while __first stays somewhere else in the coercion system, and now keeps a parent (namely the middle one) alive that may be collectable. I'll worry about it later...
Hm. Any time you have different versions of the same map that may diverge in the strength with which they refer to one of their domain, codomain, parent, etc., you'll need to make a copy to accommodate the divergence in strength. So it probably makes sense to only have two levels: either domain, codomain, and parent are referenced strongly, or domain and codomain are referenced weakly (and probably there's no reference to parent at all).
That means that "weak" map compositions in the coercion system need to have a strong reference to the middle domain. That shouldn't be too bad.
Apart from possible efficiency problems, I think this idea can be made to work. The main programming penalty is the added burden on writing new map classes: maps must be able to generate a copy of themselves. You could make this optional: maps unable to copy themselves would be just stored as-is in the coercion framework, with all the memory leak consequences this has. If it's cheap to figure out if maps are weak and/or are capable of generating a weak/strong version of themselves, we could just accommodate both: If a "weakened" map arrives into the coercion framework it can just be used as-is. If a normal map arrives, we see if it can be weakened. If so, we make a weakened copy and store that. Otherwise the map is used as-is.
If a map is about to be passed out of the coercion framework, we check if it's weakened. If not, we can just give out the map itself. Otherwise, we make a strengthened copy and give that. If making a strengthened copy is not possible, we'd have to raise an error.
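The opt-in protocol described above could be phrased like this. All hook names used here (_is_weakened, _weakened_copy, _strengthened_copy) are made up for illustration; the sketch only captures the decision logic:

```python
def store_coercion(cache, key, mor):
    """Entry into the framework: store a weakened copy when the map
    supports weakening, otherwise store it as-is (accepting the leak)."""
    if getattr(mor, "_is_weakened", False):
        cache[key] = mor                       # already weak: use as-is
    elif hasattr(mor, "_weakened_copy"):
        cache[key] = mor._weakened_copy()      # don't mutate the caller's map
    else:
        cache[key] = mor                       # not weakenable: leak risk

def hand_out_coercion(mor):
    """Exit from the framework: never let a weakened map escape."""
    if not getattr(mor, "_is_weakened", False):
        return mor
    if hasattr(mor, "_strengthened_copy"):
        return mor._strengthened_copy()
    raise RuntimeError("map cannot be strengthened for external use")
```

Note that weakening on entry happens on a copy, which also addresses the concern above about weakening a map that someone else is still holding.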
Of course, the main question of whether this is acceptable is whether it can be done with little to no increase in overhead compared to now. Costs come at two points.
I guess the only way to see whether this is worth it is by trying. At least the semantics are clear. I find it a little scary to weigh down the entire interface for "Maps" but if we can make it opt-in it's perhaps not too much of a burden. (We could just gradually upgrade map classes to be "weakenable" as we find them to be responsible for memory leaks)
I just found something crazy: Apparently the rational field got garbage collected in one of the crashing examples. Hard to believe. But it is in something using modular symbols, which is very old code, predating the unique-parent paradigm. So, it could be that it calls the constructor of the rational field, rather than a factory or whatever makes the rational field unique.
Replying to @simon-king-jena:
I just found something crazy: Apparently the rational field got garbage collected in one of the crashing examples. Hard to believe.
Fortunately I was mistaken: It is only the case that the __init__ method is called repeatedly on the same object QQ if you call RationalField() repeatedly. Namely, it is not using a classcall metaclass, but does the caching in __new__---but __init__ is always called after __new__, whether the object is cached or not. That's one of the reasons for introducing the classcall metaclass.
Perhaps it would be a good idea to change RationalField to use ClasscallMetaclass, but of course on a new ticket. Anyway, it is not related to the problem here.
Replying to @simon-king-jena:
Perhaps it would be a good idea to change RationalField to use ClasscallMetaclass, but of course on a new ticket.
I created #15247
Too bad. The idea as suggested doesn't actually solve the memory leak; it just makes it less severe (by a constant factor). The problem is: The weakened maps don't prevent their domain from being GCed, but after that happens they linger (now defunct) in _coerce_from. You'll see that even with your patch in, the example in the ticket description will still eat memory--just a little less quickly. You'll find that CDF._coerce_from_hash will contain a LOT of entries.
If we were to use a mix of _coerce_from and _coerce_to (somehow choosing which one to use) you wouldn't see this problem.
If we really want/need to, we could probably salvage the "weakened map" solution:
- we could install a callback on the weakrefs.
- defunct maps are easy to recognize: they have a dead weakref in their domain. We could just periodically scrub _coerce_from for defunct maps. One possible strategy would be to keep a "reference size" for _coerce_from and every time we add an entry we check if it is now double the reference size. If it is, we trigger gc, scrub, and reset the reference size. This would at least keep the list bounded in size and hopefully limit the number of expensive scrubs we have to do (treating the bounds exponentially should lead to small amortized costs).

I have pushed my branch. I wonder why this did not show up as a post on trac. Anyway, I checked that when clicking on "Commits" then everything is there.
I hesitated to push it before, because it is not ready for review. However, it would be better if we'd all agree what code we are talking about.
Replying to @nbruin:
Too bad. The idea as suggested doesn't actually solve the memory leak; it just makes it less severe (by a constant factor).
I think this is not the case with the branch that I have just uploaded. I did
sage: for D in xrange(2,2**30):
....:     print get_memory_usage()
....:     Q = QuadraticField(-D)
First, the memory consumption very slowly rose from 213.60546875 to 214.10546875 and then remained steady for several minutes, until I interrupted. And I rather think that the increased consumption was just due to the increased size of D.
The problem is: The weakened maps don't prevent their domain from being GCed, but after that happens they linger (now defunct) in _coerce_from. You'll see that even with your patch in, the example in the ticket description will still eat memory--just a little less quickly. You'll find that CDF._coerce_from_hash will contain a LOT of entries.
This is clearly not the case with the current commit. After running thousands of cycles in the above "for" loop, I get
sage: CDF._introspect_coerce()
{'_action_hash': <sage.structure.coerce_dict.TripleDict at 0xa26b25c>,
'_action_list': [],
'_coerce_from_hash': <sage.structure.coerce_dict.MonoDict at 0xa26b3e4>,
'_coerce_from_list': [],
'_convert_from_hash': <sage.structure.coerce_dict.MonoDict at 0xa26b41c>,
'_convert_from_list': [],
'_element_init_pass_parent': False,
'_embedding': None,
'_initial_action_list': [],
'_initial_coerce_list': [],
'_initial_convert_list': []}
sage: gc.collect()
2213
sage: for k,v in CDF._introspect_coerce().iteritems():
....:     if v:
....:         print k, len(v)
....:
_coerce_from_hash 4
_convert_from_hash 3
If we were to use a mix of _coerce_from and _coerce_to (somehow choosing which one to use) you wouldn't see this problem.
I don't see this problem anyway.
If we really want/need to, we could probably salvage the "weakened map" solution:
- we could install a callback on the weakrefs.
What weakrefs are you talking about? Those to domain and codomain?
And what would the callback be supposed to do?
- defunct maps are easy to recognize: they have a dead weakref in their domain. We could just periodically scrub _coerce_from for defunct maps.
Well, with the current commit, if a map becomes defunct by some garbage collection then it is removed from _coerce_from_hash by the same garbage collection; hence, it will not linger around.
A different story is _coerce_from_list, which is a list. Here, we might need to take care of defunct maps. Perhaps the maps there shouldn't be weakened in the first place (_coerce_from_list is filled by register_coercion(), and perhaps it would make sense to keep these maps strong), but I am not sure if this wouldn't re-introduce the QuadraticField leak I have just fixed.
One possible strategy would be to keep a "reference size" for _coerce_from and every time we add an entry we check if it is now double the reference size. If it is, we trigger gc, scrub, and reset the reference size.
As I have stated above, I don't think there is a problem with _coerce_from_hash accumulating garbage. But _coerce_from_list may contain garbage, of limited size, because it is only added to by explicit calls to register_coercion().
I think a better strategy would be to make discover_coercion check whether it meets a defunct map when it performs the backtrack algorithm.
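That check could be as simple as filtering during the walk. A sketch, assuming defunct maps either return None for a dead domain or raise a ValueError (mirroring the invalid-state error seen later in this thread):

```python
def live_maps(registered_maps):
    """Yield only maps whose weakly referenced domain still exists,
    so the backtracking search silently steps over defunct entries."""
    for mor in registered_maps:
        try:
            if mor.domain() is None:      # dead weakref
                continue
        except ValueError:                # map already flagged invalid
            continue
        yield mor
```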
Anyway, here is the problem I am currently having:
sage: E = ModularSymbols(11).2
sage: s = E.modular_symbol_rep()
sage: del E,s
sage: import gc
sage: gc.collect()
1309
sage: E = ModularSymbols(11).2
sage: v = E.manin_symbol_rep()
sage: c,x = v[0]
sage: y = x.modular_symbol_rep()
sage: y.parent()
Abelian Group of all Formal Finite Sums over Integer Ring
sage: A = y.parent().get_action(QQ, self_on_left=False, op=operator.mul)
sage: A.right_domain()
Abelian Group of all Formal Finite Sums over Integer Ring
sage: A.left_domain()
Rational Field
sage: A.codomain()
Traceback (most recent call last):
...
RuntimeError: This action acted on a set that became garbage collected
Hence, if I understand correctly, the "Formal Finite Sums over Rational Field" became garbage collected.
And that's where actions differ from maps:
In the case of coercion maps, we have domain and codomain, which together are used as weak cache keys for the map. Hence, if either domain or codomain become garbage, then the map becomes invalid but is immediately removed from the cache and will thus not hurt.
In the case of actions, we have left domain and right domain, which together are used as weak cache keys for the action. However, in addition to the two domains, an action has a codomain, which currently is weakly referenced (this was in order to fix some memory leak). If the codomain is distinct from both left and right domain and if the codomain becomes garbage, then the action becomes invalid, but it would NOT be removed from the cache, as long as left and right domain are not garbage.
I am afraid that having a strong reference to the codomain of all actions will re-introduce a memory leak fixed elsewhere. But in the above analysis, you may notice that there only is a problem if the codomain is distinct from both left and right domain. Hence, we could be clever and have a strong reference to the codomain only in this case, and a weak reference otherwise.
I will try this now...
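The "strong only if distinct" rule could be sketched like this. ActionCodomainRef is a made-up helper for illustration; Sage's actions are organized differently:

```python
import weakref

class ActionCodomainRef:
    """Hold the codomain of an action weakly when it equals the left or
    right domain (the cache keys keep it alive then), and strongly when
    it is a third, otherwise unreferenced parent."""
    def __init__(self, left, right, codomain):
        if codomain is left or codomain is right:
            self._get = weakref.ref(codomain)
        else:
            self._get = lambda strong=codomain: strong   # strong reference
    def codomain(self):
        cod = self._get()
        if cod is None:
            raise RuntimeError("This action acted on a set that became "
                               "garbage collected")
        return cod
```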
I just learnt that the codomain of an action coincides with the set that is acted upon. But here, we have a sage.categories.action.PrecomposedAction. So, it composes maps phi (from the left domain) and psi (from the right domain) with an action alpha that knows about the codomains of phi and psi only. And thus perhaps we have again the problem of keeping "the middle parent" alive.
Namely, if the underlying set S of alpha is the codomain of psi, but psi is weak, then neither psi nor alpha will keep S alive. But S is sometimes not used as cache key for the precomposed action: only the domains of psi and phi appear in the key. Hence, I think I just need to add a strong reference to the underlying set of alpha.
The proposed change does in fact fix the problem mentioned in comment:63.
I have pushed the new commit. The example from comment:63 became a doctest.
I am now running make ptest. Let us see how many problems persist. Depending on the result, we may worry later whether or not we want to add a safe user interface to the coercion system.
Note that I have retested
sage: for D in xrange(2,2**30):
....:     print get_memory_usage()
....:     Q = QuadraticField(-D)
with the current commit. The memory consumption stays level after a short while, and moreover the numbers returned by get_memory_usage() are now less than 190!
Result of make ptest with the current commit:
sage -t src/sage/interfaces/maxima_abstract.py # 1 doctest failed
sage -t src/sage/matrix/matrix2.pyx # 1 doctest failed
sage -t src/sage/categories/group_algebras.py # Killed due to segmentation fault
sage -t src/sage/combinat/symmetric_group_algebra.py # Killed due to segmentation fault
sage -t src/sage/combinat/ncsf_qsym/ncsf.py # Killed due to segmentation fault
sage -t src/sage/combinat/words/morphism.py # 10 doctests failed
sage -t src/sage/schemes/toric/morphism.py # 4 doctests failed
sage -t src/sage/matrix/matrix0.pyx # 3 doctests failed
sage -t src/sage/categories/pushout.py # Killed due to segmentation fault
sage -t src/sage/rings/polynomial/plural.pyx # 11 doctests failed
sage -t src/sage/libs/singular/function.pyx # 2 doctests failed
sage -t src/sage/structure/coerce.pyx # 5 doctests failed
sage -t src/sage/libs/singular/groebner_strategy.pyx # 2 doctests failed
So, there remains a lot of work to do.
Replying to @simon-king-jena:
I think this is not the case with the branch that I have just uploaded. I did
Got it! Thanks. To have documented why we're OK here: _coerce_from_hash is a MonoDict, so it holds a strong reference to the map stored. This is what keeps the map alive. However, when the domain vanishes, then the MonoDict callback will remove the entry and hence the strong reference to the map. This makes it possible for the map to be GCed. So the required callback to clean up is already triggered via the key.
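This mechanism can be reproduced with Python's weakref.WeakKeyDictionary, which behaves like a MonoDict for the purpose of this argument: weakly held keys, strongly held values, and a callback that drops the entry when the key dies:

```python
import gc
import weakref

class Parent:          # stand-in for a domain parent
    pass

class Map:             # stand-in for a coercion map
    pass

cache = weakref.WeakKeyDictionary()
Q, phi = Parent(), Map()
cache[Q] = phi
phi_ref = weakref.ref(phi)

del phi                # the cache entry still holds the map strongly
gc.collect()
assert phi_ref() is not None

del Q                  # the key dies: the callback removes the entry,
gc.collect()           # releasing the last strong reference to the map
assert phi_ref() is None
```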
What weakrefs are you talking about? Those to domain and codomain? And what would the callback be supposed to do?
Indeed, domain and codomain. So those callbacks are already taken care of by _coerce_from_hash being a MonoDict. The codomain is actually not relevant for this (and would be strongly referenced by the map anyway) -- perhaps "weakened" maps should only have their domain reference weakened and their parent (the reference to the homset) cleared? If we just always keep the codomain strongly referenced, compositions would naturally keep things alive anyway. Virtually all maps have a strong ref to the codomain (via generator images) internally anyway, so putting it explicitly on the outside shouldn't hurt much.
A different story is _coerce_from_list, which is a list. Here, we might need to take care of defunct maps. Perhaps the maps there shouldn't be weakened in the first place (_coerce_from_list is filled by register_coercion(), and perhaps it would make sense to keep these maps strong), but I am not sure if this wouldn't re-introduce the QuadraticField leak I have just fixed.
Do we really need _coerce_from_list? Have we identified what it does that cannot be accomplished with the data stored in _coerce_from_hash?
As I have stated above, I don't think there is a problem with _coerce_from_hash accumulating garbage. But _coerce_from_list may contain garbage, of limited size, because it is only added to by explicit calls to register_coercion().

I think a better strategy would be to make discover_coercion check whether it meets a defunct map when it performs the backtrack algorithm.
I doubt you could prove about such an incidental "bumping in" strategy that the number of defunct maps is bounded linearly in the number of active stuff (perhaps measured by active entries in _coerce_from?). That means you wouldn't be able to prove that you don't have a memory leak, and I suspect that with a little study, one could concoct an example that exhibits the leak as well. If discover_coercion would regularly visit all entries in _coerce_from_list then the process would be way too slow.
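The doubling heuristic proposed earlier in the thread amounts to the following sketch (names are made up); because the scrub threshold grows geometrically with the live size, each entry is scanned O(1) times on average, so the amortized cost per insertion stays small:

```python
import gc

class ScrubbingList:
    """Sketch of the "reference size" strategy: when the list grows to
    twice the size seen after the last scrub, drop defunct entries and
    reset the reference size."""
    def __init__(self, is_defunct):
        self._items = []
        self._is_defunct = is_defunct
        self._reference_size = 1

    def append(self, item):
        self._items.append(item)
        if len(self._items) >= 2 * self._reference_size:
            gc.collect()   # give weakrefs a chance to die first
            self._items = [x for x in self._items
                           if not self._is_defunct(x)]
            self._reference_size = max(1, len(self._items))
```

This keeps the number of defunct entries bounded by the number of live ones, which is exactly the linear bound discussed above.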
Given how delicate this code is, please do put ample documentation about assumptions and strategies in the code. Otherwise, the next person to work on this code will likely screw things up. Hopefully it'll be caught by doctests, but we've seen that this could easily not happen.
So one action point: Have you considered just leaving the codomain strongly referenced? Apart from the fact that _make_weak and _make_strong on map compositions still need to recurse into components, it should make compositions easier to work with, because the first map will keep the middle (co)domain alive, as desired.
It also means _codomain can remain a fast slot.
I currently don't see a scenario where weakening the codomain reference buys us anything, especially because maps internally will usually reference the codomain.
Another action point: see if _coerce_from_list can simply be thrown away.
Replying to @nbruin:
Indeed, domain and codomain. So those callbacks are already taken care of by _coerce_from_hash being a MonoDict. The codomain is actually not relevant for this (and would be strongly referenced by the map anyway) -- perhaps "weakened" maps should only have their domain reference weakened and their parent (the reference to the homset) cleared? If we just always keep the codomain strongly referenced, compositions would naturally keep things alive anyway. Virtually all maps have a strong ref to the codomain (via generator images) internally anyway, so putting it explicitly on the outside shouldn't hurt much.

This might actually be a good idea. Note that some wise person decided to store coerce maps on the codomain. Hence, having a strong reference from the map back to the codomain will not hurt at all, with the additional advantage you just mentioned. It would also fix the problem with composed maps and with PrecomposedAction!
Do we really need _coerce_from_list? Have we identified what it does that cannot be accomplished with the data stored in _coerce_from_hash?
I don't know why it had originally been introduced. But when I added the weak versions of triple and mono dict, I actually thought of it as a way to make some coercions (namely those explicitly registered) permanent.
I doubt you could prove about such an incidental "bumping in" strategy that the number of defunct maps is bounded linearly in the number of active stuff (perhaps measured by active entries in _coerce_from?). That means you wouldn't be able to prove that you don't have a memory leak, and I suspect that with a little study, one could concoct an example that exhibits the leak as well.
Sure. But the more you have to struggle to construct a leak, the happier I'll be... :-P
Given how delicate this code is, please do put ample documentation about assumptions and strategies in the code. Otherwise, the next person to work on this code will likely screw things up. Hopefully it'll be caught by doctests, but we've seen that this could easily not happen.
OK, I'll try to be explicit in either the comments in the code, or in the docs.
I can confirm that with a strong reference to the codomain, the leak is still fixed. Now trying to see if it also prevents some of the crashes I've seen in the tests!
Ouch. Sage started, but the first four tests I've run have segfaulted. Very bad. How can that be?
Replying to @simon-king-jena:
Ouch. Sage started, but the first four tests I've run have segfaulted. Very bad. How can that be?
This is really getting scary. I see in an interactive session that the leak is fixed, but according to the corresponding doctest it isn't fixed.
Replying to @simon-king-jena:
This is really getting scary. I see in an interactive session that the leak is fixed, but according to the corresponding doctest it isn't fixed.
What fun...
When counting the number of quadratic number fields tracked by gc, I did
numberQuadFields = len([x for x in gc.get_objects() if isinstance(x, C)])
And apparently, just by chance, x became a pointer to the number field that I wanted to delete. Hence, I had to add del x to make the tests pass.
I am now back at trying to trace down the errors mentioned in comment:68. Frustratingly, with my current local branch (not yet posted), I cannot reproduce all the errors, even though the code during comment:68 looked nearly the same.
One rather puzzling error:
sage: import gc
sage: SG4 = SymmetricGroupAlgebra(ZZ,4)
sage: gc.collect()
1051
sage: SG4(1).is_central()
------------------------------------------------------------------------
./sage: Zeile 134: 14576 Speicherzugriffsfehler "$SAGE_ROOT/src/bin/sage" "$@"
(the German shell message for a segmentation fault)
Why is this puzzling? Well, it does not occur when I make both domain and codomain weak references, for coerce maps. But it does occur when I only make the domain a weak reference.
So, is it the case that "the more weak references, the more crash safe"??
Replying to @simon-king-jena:
numberQuadFields = len([x for x in gc.get_objects() if isinstance(x, C)])
Perhaps use
numberQuadFields = sum(1 for x in gc.get_objects() if isinstance(x, C))
instead, to prevent leaking x?
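The underlying Python pitfall: in Python 2, a list comprehension (like an ordinary for-loop) leaves its loop variable bound in the enclosing scope, so x kept the last inspected parent alive; a generator expression keeps its loop variable to itself. A small demonstration of the scoping behaviour (runnable as-is under Python 3, where only the plain loop leaks):

```python
class Field:
    """Stand-in for a parent such as a quadratic number field."""

objs = [Field() for _ in range(3)]

# A plain for-loop (and, in Python 2, also a list comprehension)
# leaves its loop variable bound afterwards:
for x in objs:
    pass
assert x is objs[-1]       # x still references the last Field

# A generator expression does not leak its loop variable, so counting
# this way leaves no stray reference that keeps a parent alive:
count = sum(1 for y in objs if isinstance(y, Field))
assert count == 3
try:
    y                      # y was never bound in this scope
    leaked = True
except NameError:
    leaked = False
assert not leaked
```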
It reduces to
sage: SG4 = SymmetricGroupAlgebra(ZZ,4)
sage: SG4._introspect_coerce()['_coerce_from_list']
[Generic morphism:
   From: Integer Ring
   To:   Symmetric group algebra of order 4 over Integer Ring,
 Generic morphism:
   From: Symmetric group algebra of order 3 over Integer Ring
   To:   Symmetric group algebra of order 4 over Integer Ring]
sage: phi, psi = _
sage: import gc
sage: gc.collect()
1062
sage: phi
Generic morphism:
From: Integer Ring
To: Symmetric group algebra of order 4 over Integer Ring
sage: psi
Traceback (most recent call last):
...
ValueError: This map is in an invalid state, domain or codomain have been garbage collected
So, perhaps we need to keep the maps in _coerce_from_list strong after all.
PS: But why does it not crash if the coerce maps' codomains get weakref'd too?
For the record: When I do not weaken the maps in _coerce_from_list or _convert_from_list, the crash vanishes, but the memory leak is still fixed.
Also in my unpublished branch: Skip invalid maps in discover coercion (in the backtracking algorithm). But this is just additional safety. I think we want that explicitly registered maps (i.e., those which are the fundamental pathways in the backtracking algorithm) stay alive. Hence, it might even be worthwhile to keep an explicit reference to the domain of the map, so that the map will remain valid even if something weakens it.
I am now running tests again.
To summarise the requirements found in our discussion for a stable and memory friendly coercion system:
- Maps registered by P.register_coercion(mor) are the backbone of the discover_coercion algorithm. Hence, they need to be kept healthy as long as P lives.
- If P has a coerce embedding then this embedding needs to be kept healthy as long as P lives.

To summarise how I think we can meet these requirements:
- P.register_coercion(mor) keeps a strong reference to mor.domain() in a new attribute P._registered_domains (which is a list). All other maps in the coercion system will be weakened.
- We store the map phi found by P.discover_coercion(Q) in P._coerce_from_hash, which is a MonoDict. In particular, if Q is garbage collected, then phi is immediately removed from the cache, and thus the strong reference of phi to the codomain P will not prevent P from garbage collection. And if P is garbage collected, then the whole cache including phi is removed. However, if both P and Q live, then phi will stay in P._coerce_from_hash[Q], and it will stay healthy, because its domain Q is still alive (and hence the weak reference is still ok).

I think this model makes sense.
And concerning a safe user interface: Perhaps we can do without. Namely, we could make it so that the string representation of a weakened map includes a warning, suggesting to use a copy. In this way, it would be sufficiently likely that the user wouldn't have too many bad surprises.
Replying to @simon-king-jena:
- Maps registered by P.register_coercion(mor) are the backbone of the discover_coercion algorithm. Hence, they need to be kept healthy as long as P lives.

It's not clear to me this is the case. If mor is a map from S to P then P.register_coercion(mor) ensures the coercion system knows how to map an element from S into P. Why is it important to maintain this information for the lifetime of P? If S ceases to exist, then one would not need to map elements from S to P. Why do we need to ensure S keeps existing? Shouldn't P have a reference to S independent of the coercion system if P's health depends on the existence of S? I think the coercion system is there to keep track of relations between existing objects, not to keep objects in existence (although this might be a side-effect of keeping track of relations between other objects).
- If P has a coerce embedding then this embedding needs to be kept healthy as long as P lives.
I presume this is the function of the _embedding coercion attribute? So it looks like we already have that. What if we want multiple embeddings?
- P.register_coercion(mor) keeps a strong reference to mor.domain() in a new attribute P._registered_domains (which is a list). All other maps in the coercion system will be weakened.
So what means are there available to add a coercion without tying the lifespan of one to the other? Shouldn't it be possible to specify such things too?
- We store the map phi found by `P.discover_coercion(Q)` in `P._coerce_from_hash`, which is a `MonoDict`. In particular, if Q is garbage collected, then phi is immediately removed from the cache, and thus phi's strong reference to the codomain P will not prevent P from being garbage collected.
It never did prevent collection: the reference to phi is held by P, so the reference from phi to P would be recognized as part of a cycle, and the cyclic GC would find it (parents are usually involved in cycles already, so cyclic GC is their only chance of being reclaimed anyway). Your statement is still true.
And concerning a safe user interface: Perhaps we can do without. Namely, we could make it so that the string representation of a weakened map will consist of a warning, suggesting to use a copy. In this way, it would be sufficiently likely that the user wouldn't have too many bad surprises.
Hm, we'll see.
With my current, not yet published code, I get errors only in one file, namely with scheme morphisms. And this actually isn't a surprise, since scheme morphisms almost completely ignore the existing methods they inherit from `sage.categories.map.Map`. They have a custom `__init__` doing more or less what the old version of `Map.__init__` used to do, and they even override `domain()` and `codomain()`.
So, this should be fixable, and I suppose I will be able to post a commit so that all tests pass, later today.
Replying to @nbruin:
Replying to @simon-king-jena:
- Maps registered by `P.register_coercion(mor)` are the backbone of the discover_coercion algorithm. Hence, they need to be kept healthy as long as `P` lives.

It's not clear to me that this is the case. If `mor` is a map from S to P, then `P.register_coercion(mor)` ensures the coercion system knows how to map an element from S into P. Why is it important to maintain this information for the lifetime of P? If S ceases to exist, then one would not need to map elements from S to P.
One does! Namely, suppose that you first do `P.register_coercion(mor)`, then you allow `S` to become garbage collected, and then you create a new parent `Q` with an embedding into what looks like `S` (which by now is, of course, a replica of the original, garbage collected `S`).
You would want `Q` to coerce into `P` via `S`, by transitivity of coercions. But you collected `mor.domain()`, and thus you will fail to find this coercion.
Another attempt to explain why I think the explicitly registered maps need to be kept:
Think of the coercions as a digraph.
- The vertices correspond to parents.
- The arrows (oriented edges) of the digraph correspond to those maps that are declared as coercions by `.register_coercion(...)`.
- In addition to the arrows created with `register_coercion`, the coercion system may find short-cuts by calling `._coerce_map_from_(...)`. For efficiency reasons, these short-cuts are stored in a cache.
- The coercion system then tries to find directed paths between two given vertices, including short-cuts. For efficiency reasons, it stores the resulting composed maps in a cache.

How does the coercion system find "directed paths with short-cuts" from Q to P?
Remark: Perhaps in the long run, we could actually use a proper digraph (with a lightning-fast backend) to keep track of coercions.
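The path search sketched above can be illustrated by a toy model. This is a plain-Python sketch (all names hypothetical, not Sage's actual backtrack algorithm): registered coercions form the arrows, a breadth-first search discovers a directed path, and the discovered composite is cached as a short-cut.

```python
from collections import deque

# Toy coercion digraph: registered arrows only (the "backbone").
# Q -> S -> P, as in the register_coercion example above.
arrows = {"Q": ["S"], "S": ["P"]}
shortcuts = {}   # cache of discovered composite paths

def discover_path(start, goal):
    """Breadth-first search for a directed path; cache the result as a short-cut."""
    if (start, goal) in shortcuts:
        return shortcuts[(start, goal)]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            shortcuts[(start, goal)] = path   # store the composite
            return path
        for nxt in arrows.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None   # no coercion found

assert discover_path("Q", "P") == ["Q", "S", "P"]
assert ("Q", "P") in shortcuts   # subsequent lookups use the cached short-cut
```

This also shows why the registered arrows matter for connectivity: if the vertex `"S"` (and its arrows) were dropped, the search from `"Q"` to `"P"` would only succeed if the composite had already been cached as a short-cut.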
Now for garbage collection:
Why do we need to ensure S keeps existing? Shouldn't P have a reference to S independent of the coercion system if P's health depends on the existence of S? I think the coercion system is there to keep track of relations between existing objects, not to keep objects in existence (although this might be a side-effect of keeping track of relations between other objects).
I think that in many cases it would indeed be the case that P should reference S independent of coercions (say, if S is the base ring of P). However, it is conceivable (and I think I've met such cases in failing doctests) that P can remain perfectly healthy without S, but still we would like to find coercions from Q to P via S.
So, the coercion system will not care for the health of P, but it must care for the connectivity of the coercion graph. And in the current algorithm sketched above, it is of vital importance to keep arrows leading to P alive, since otherwise we have to live with shortcuts only.
- If P has a coerce embedding, then this embedding needs to be kept healthy as long as P lives.

I presume this is the function of the `_embedding` coercion attribute? So it looks like we already have that. What if we want multiple embeddings?
Then we look into the code and see that multiple embeddings are explicitly excluded. A parent has precisely one embedding that is taken into account by the coercion search in the forward direction. You may of course register further embeddings as coercions, by `emb.codomain().register_coercion(emb)`, but they would only be used for searches in the backward direction.
`P.register_coercion(mor)` keeps a strong reference to `mor.domain()` in a new attribute `P._registered_domains` (which is a list). All other maps in the coercion system will be weakened.

So what means are available to add a coercion without tying the lifespan of one to the other?
Note that after `P.register_coercion(mor)`, `mor.domain()` will live at least as long as P. But `mor.domain()` alone would not be enough to keep P alive.
And there surely is a means available to add a coercion that doesn't tie the lifespans of two parents too closely: implement `P._coerce_map_from_(Q)`, which can return a map or simply `True` (in the latter case, conversion is used to coerce Q into P). The result is cached in `P._coerce_from_hash`, but not in `P._coerce_from_list`.
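The difference between this cached-only mechanism and a strong registration can be sketched in plain Python. The following is a toy model (hypothetical `Parent`/`PolyRing` classes, not Sage's implementation): the `_coerce_map_from_` hook supplies a coercion on demand, the result lands in a weak-key cache, and the cache does not keep the domain alive.

```python
import gc
import weakref

class Parent:
    """Hypothetical parent with a weak-key coercion cache."""
    def __init__(self, name):
        self.name = name
        # entries disappear as soon as the domain is garbage collected
        self._coerce_from_hash = weakref.WeakKeyDictionary()

    def _coerce_map_from_(self, Q):
        """Hook: return a map, True (meaning: use conversion), or None."""
        return None

    def coerce_map_from(self, Q):
        if Q in self._coerce_from_hash:
            return self._coerce_from_hash[Q]
        found = self._coerce_map_from_(Q)
        if found is True:
            found = ("conversion", Q.name, self.name)  # stand-in for a map
        if found is not None:
            self._coerce_from_hash[Q] = found          # cached, but not pinned
        return found

class PolyRing(Parent):
    """Toy rule: accept coercion (via conversion) from anything named 'QQ'."""
    def _coerce_map_from_(self, Q):
        return True if Q.name == "QQ" else None

P = PolyRing("QQ[x]")
Q = Parent("QQ")
assert P.coerce_map_from(Q) == ("conversion", "QQ", "QQ[x]")
assert len(P._coerce_from_hash) == 1

del Q          # P's cache does not keep Q alive...
gc.collect()
assert len(P._coerce_from_hash) == 0   # ...so the cached coercion is gone
```

With `register_coercion`, by contrast, the domain would go into something like `P._registered_domains` and survive for the lifetime of `P`.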
Shouldn't it be possible to specify such things too?
Perhaps in the long run? I wouldn't like to do those things (such as: Rewrite the coercion system to use a proper digraph backend) here.
Replying to @simon-king-jena:
With my current, not yet published code, I get errors only in one file, namely with scheme morphisms. And this actually isn't a surprise, since scheme morphisms almost completely ignore the existing methods they inherit from `sage.categories.map.Map`. They have a custom `__init__` doing more or less what the old version of `Map.__init__` used to do, and they even override `domain()` and `codomain()`.
Arrgh, it is even worse: `SchemeMorphism` just inherits from `Element`, not from `Map`! Unbelievable.
Author: Simon King
I have pushed two more commits, and now all doctests should pass (to be verified).
I also think I have extensively explained the rationale behind the changes. Hence, I have already clicked "Needs review". But it was too soon (see below).
A potential "todo": We might want a weakened map returned by `P.coerce_map_from(Q)` to print as
sage: QQ['x'].coerce_map_from(QQ)
This is a map used by the coercion system.
If you want to use it outside of the coercion system,
please use a copy of the map
sage: copy(_)
Polynomial base injection morphism:
From: Rational Field
To: Univariate Polynomial Ring in x over Rational Field
In this case, we may want to have `__copy__` methods everywhere. Even though the current branch improves copying, I just obtained the following crash that I want to fix first:
sage: QQ['x'].coerce_map_from(ZZ)
Composite map:
From: Integer Ring
To: Univariate Polynomial Ring in x over Rational Field
Defn: Natural morphism:
From: Integer Ring
To: Rational Field
then
Polynomial base injection morphism:
From: Rational Field
To: Univariate Polynomial Ring in x over Rational Field
sage: phi = copy(_)
sage: phi(2)
<SEGFAULT>
Work Issues: Provide copy of composed maps
Done, and now I'd say it can be reviewed. With the latest commit, one can do
sage: phi = QQ['x'].coerce_map_from(ZZ)
sage: phi.domain
<weakref at 0xa225284; to 'sage.rings.integer_ring.IntegerRing_class' at 0x96ddd3c (EuclideanDomains.parent_class)>
sage: type(phi)
<type 'sage.categories.map.FormalCompositeMap'>
sage: psi = copy(phi)
sage: psi
Composite map:
From: Integer Ring
To: Univariate Polynomial Ring in x over Rational Field
Defn: Natural morphism:
From: Integer Ring
To: Rational Field
then
Polynomial base injection morphism:
From: Rational Field
To: Univariate Polynomial Ring in x over Rational Field
sage: psi(3)
3
sage: psi.domain
The constant function (...) -> Integer Ring
which required implementing `__copy__` for polynomial basering injections and formal composite maps.
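The idea behind copying a formal composite map can be sketched in plain Python: the copy must copy its constituent maps as well, rather than share them with the coercion system (where they may be weakened). All names here are hypothetical stand-ins, not the actual Sage classes.

```python
import copy

class FormalCompositeMap:
    """Hypothetical composite map: apply `first`, then `then`."""
    def __init__(self, first, then):
        self.first = first
        self.then = then

    def __call__(self, x):
        return self.then(self.first(x))

    def __copy__(self):
        # Copy the constituents too, so the copy does not share
        # (possibly weakened) inner maps with the coercion system.
        return FormalCompositeMap(copy.copy(self.first), copy.copy(self.then))

# Plain callables as stand-ins for the inner maps.
inc = lambda x: x + 1
dbl = lambda x: 2 * x

phi = FormalCompositeMap(inc, dbl)
psi = copy.copy(phi)        # dispatches to __copy__
assert psi(2) == 6          # (2 + 1) * 2
```

In the real fix, the copied constituents would additionally be "strengthened" (given strong references to their domains), which is what prevents the segfault shown above.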
If you want me to make weakened coercion maps print as a big warning, then I could of course do so, but will only do if you ask.
Too bad. Apparently the elliptic curve code does not like `SchemeMorphism` to be a `Morphism`...
Changed work issues from Provide copy of composed maps to Fix elliptic curves code
Namely:
sage: E=EllipticCurve('37a1')
sage: P=E(0,0)
sage: Q=5*P
<Booom>
apparently while trying to create an action.
Aha.
sage: P.__class__.mro()
[sage.schemes.elliptic_curves.ell_point.EllipticCurvePoint_number_field,
sage.schemes.elliptic_curves.ell_point.EllipticCurvePoint_field,
sage.schemes.projective.projective_point.SchemeMorphism_point_abelian_variety_field,
sage.structure.element.AdditiveGroupElement,
sage.structure.element.ModuleElement,
sage.schemes.projective.projective_point.SchemeMorphism_point_projective_field,
sage.schemes.projective.projective_point.SchemeMorphism_point_projective_ring,
sage.schemes.generic.morphism.SchemeMorphism_point,
sage.schemes.generic.morphism.SchemeMorphism,
sage.categories.morphism.Morphism,
sage.categories.map.Map,
sage.structure.element.Element,
sage.structure.sage_object.SageObject,
object]
and
sage: P.domain
<bound method EllipticCurvePoint_number_field.domain of (0 : 0 : 1)>
sage: P.domain.__module__
'sage.schemes.elliptic_curves.ell_point'
sage: P.domain??
Type: instancemethod
String Form:<bound method EllipticCurvePoint_number_field.domain of (0 : 0 : 1)>
File: /home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/schemes/elliptic_curves/ell_point.py
Definition: P.domain(self)
Source:
def domain(self):
"""
Return the domain of this point, which is `Spec(F)` where `F` is
the field of definition.
EXAMPLES::
sage: E=EllipticCurve(QQ,[1,1])
sage: P=E(0,1)
sage: P.domain()
Spectrum of Rational Field
sage: K.<a>=NumberField(x^2-3,'a')
sage: P=E.base_extend(K)(1,a)
sage: P.domain()
Spectrum of Number Field in a with defining polynomial x^2 - 3
"""
return self.parent().domain()
So, not only was `SchemeMorphism` ignoring `Morphism`, but additionally its subclass `EllipticCurvePoint_field` overrode what it inherited from `SchemeMorphism` with a verbatim copy.
The elliptic curve code is a mess.
In `sage.schemes.elliptic_curves.heegner`, it also seems to me that `GaloisAutomorphism` should use `Morphism`. I don't know if I will change it here.
Hmm. If we are in very bad luck, then the current commit triggers a bug in Cython concerning confusion of the cpdef slots when one provides two different base classes (`ModuleElement` and `Morphism`). Namely, it pretty much seems that
(<ModuleElement>left)._add_(<ModuleElement>right)
does not actually invoke
left._add_(right)
Hooray!
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 3777.0 seconds
cpu time: 5867.6 seconds
cumulative wall time: 7256.6 seconds
I added a "todo" to the documentation of `sage.schemes.generic.morphism`, stating that `SchemeMorphism` should rather inherit from `Morphism`, but currently cannot, because of a bug in Cython. I modified the code so that it now exactly imitates the new `Morphism` code. In this way, a future transition to a subclass of `Morphism` should be easier.
Anyway, since the tests pass, I'd say it is "needs review" for now.
Plan: If you believe that the new approach makes sense, then I'll also add a longer section to the reference manual, explaining how the weakref business is supposed to work.
By the way, I was looking into my logs/ptest.log, and it seems that the total cpu time for the tests remained essentially the same. Hence, hopefully there is no bad slow-down. But we should probably take some old benchmarks used at, say, #11900, and see what my new commits do with them.
Here are the tests from #11900.
In Commit:05fb569, I get
sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 10.76 s, sys: 0.18 s, total: 10.93 s
Wall time: 10.95 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 3.12 s, sys: 0.00 s, total: 3.12 s
Wall time: 3.13 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 2.28 s, sys: 0.02 s, total: 2.29 s
Wall time: 2.30 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 4.80 s, sys: 0.04 s, total: 4.84 s
Wall time: 4.85 s
sage: def test(E):
....: for p in prime_range(10000):
....: if p != 389:
....: G = E.change_ring(GF(p)).abelian_group()
....:
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 31.56 s, sys: 0.09 s, total: 31.65 s
Wall time: 31.70 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 21.96 s, sys: 0.06 s, total: 22.02 s
Wall time: 22.07 s
In public/sage-git/master, I get
sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 10.72 s, sys: 0.17 s, total: 10.90 s
Wall time: 10.91 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 3.23 s, sys: 0.00 s, total: 3.23 s
Wall time: 3.24 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 2.21 s, sys: 0.02 s, total: 2.23 s
Wall time: 2.23 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 5.27 s, sys: 0.04 s, total: 5.32 s
Wall time: 5.32 s
sage: def test(E):
....: for p in prime_range(10000):
....: if p != 389:
....: G = E.change_ring(GF(p)).abelian_group()
....:
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 32.06 s, sys: 0.11 s, total: 32.17 s
Wall time: 32.22 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 22.11 s, sys: 0.07 s, total: 22.17 s
These tests have been found to be sensitive to slowness in the category and coercion framework. So the fact that there is not the faintest regression in these tests gives some confidence.
Replying to @simon-king-jena:
One does! Namely, suppose that you first do `P.register_coercion(mor)`, then you allow `S` to become garbage collected, and then you create a new parent `Q` with an embedding into what looks like `S` (which by now is, of course, a replica of the original, garbage collected `S`).

You would want `Q` to coerce into `P` via `S`, by transitivity of coercions. But you collected `mor.domain()`, and thus you will fail to find this coercion.
I agree that there is a place for such strong connections, but I have severe reservations about declaring it's the only way or even the default way to inform the system about coercions.
The following would leak memory with the model you propose:
M1 = ZZ^1
M2 = ZZ^2
for d in [2..100000]:
    N = ZZ^d
    m1 = Hom(N, M1)([M1.0 for i in range(d)])
    m2 = Hom(N, M2)([M2.gen(i % 2) for i in range(d)])
    M1.register_coercion(m1)
    M2.register_coercion(m2)
I have severe reservations about declaring that this code "will never be memory efficient" in Sage. One way out is to use the (for this purpose) extremely inappropriately named `N.register_embedding(m1)` instead, which in your model would mean that M1 lives at least as long as N. In addition, we would not be able to do this for both `M1` and `M2`.
I think there's room for a `register_embedding` with those semantics (although it should probably have a different name, because it doesn't have to be an embedding), as well as for a `register_coercion_from_while_keeping_domain_alive` with a catchier name. But the life-enhancing side effects are fundamentally separate from the fact that there's a coercion in the first place, and I think it should be possible to register a non-life-enhancing coercion as well (the main reason being that it's easily done and that I haven't seen a proof that it's not necessary; we don't want to unnecessarily hamstring the system).
philosophical ramblings
The following observations are the result of trying to come up with a conceptual ideal/model that we could use to argue what our coercion framework SHOULD do. Up to now, it seems to me, people have mainly been making ad-hoc, gut-feeling decisions about how to put together coercion. I haven't found a better solution here, but I think some examples may illustrate that we might just have to be pragmatic about this.
Your suggestion makes sense if you want to model a "permanent, unchanging universe" in which all possible parents, with consistent coercions, already exist; we're simply "discovering" this universe in a piecemeal fashion. It's a common model in modern algebra, but I think our computer algebra model is too far away from this ideal to follow this model too far. Consider:
Qx.<x>=QQ[]
K.<a>=NumberField(x^4-2)
L.<b>=NumberField(x^2-2,embedding=a^2)
This fits perfectly into the "unchanging universe" model. Also note that the coercion system does not need to let L keep K alive, since the construction parameters, which are kept alive for the life of L by `CachedRepresentation` or something analogous, refer to K already.
Now consider
M.<b>=NumberField(x^2-2)
In the "unchanging universe" (and in sage as well) we have that M is distinct from L. However, I think it's unrealistic to expect that all "embeddings" etc. can be specified at construction time. So I think, even though it's not possible currently in sage, that one should allow for
m1=Hom(M,K)([a^2])
m2=Hom(M,K)([-a^2])
M.register_embedding(m1)
Note that the choice of `m1` or `m2` here leads to different relations between `M` and `K`, and hence to different universes.
In other words, our concept of "globally unique" is not powerful enough to capture the full identity of objects, which would include the coercion relations with objects that haven't been "discovered" yet. In practice, we can usually work around that, for instance by changing the names of generators and hence creating artificially differently labelled objects. But that is already not a possibility for creating different copies of `ZZ^n`, since there are no generator names to choose there.
I think one has to accept the reality here: what we have is a collection of objects whose relations do change in time.
Some particular responses (most of them direct corollaries from what is observed above).
Another attempt to explain why I think the explicitly registered maps need to be kept:
Think of the coercions as a digraph.
- The vertices correspond to parents.
- The arrows (oriented edges) of the digraph correspond to those maps that are declared as coercions by `.register_coercion(...)`.
- In addition to the arrows created with `register_coercion`, the coercion system may find short-cuts by calling `._coerce_map_from_(...)`. For efficiency reasons, these short-cuts are stored in a cache.
- The coercion system then tries to find directed paths between two given vertices, including short-cuts. For efficiency reasons, it stores the resulting composed maps in a cache.
That's not the only thing coercion does. It may also find "common covering structures", which may lead to construction of new parents. Those definitely don't deserve to get nailed into memory. Yet, the code that creates these parents will look (to the coercion system) as a "user", so it would be using these strong-referencing coercion registration routines.
So, the coercion system will not care for the health of P, but it must care for the connectivity of the coercion graph. And in the current algorithm sketched above, it is of vital importance to keep arrows leading to P alive, since otherwise we have to live with shortcuts only.
I think it may well be a feature, not a bug, that one at some point can just be left with the shortcuts and that the intermediates have fallen out. The natural way of staying close to "discovering a permanent universe" is by never throwing away anything that has been discovered, and I think we agree that's a "memory leak". So it's really a matter of only keeping things around that we still need. If we're too lax with considering what is "needed", we'll end up keeping too many things in memory.
Then we look into the code and see that multiple embeddings are explicitly excluded. A parent has precisely one embedding that is taken into account by the coercion search in the forward direction. You may of course register further embeddings as coercions, by `emb.codomain().register_coercion(emb)`, but they would only be used for searches in the backward direction.
Not only that: you'd be tying different lifetime implications to that construction too.
And there surely is a means available to add a coercion that doesn't tie the lifespans of two parents too closely: implement `P._coerce_map_from_(Q)`, which can return a map or simply `True` (in the latter case, conversion is used to coerce Q into P). The result is cached in `P._coerce_from_hash`, but not in `P._coerce_from_list`.
You mean: implement `P._install_coerce_map_from_(m)`, which does `_coerce_from_hash[Domain(m)] = m`. I think it is quite important to be able to manipulate the coercion graph without having to modify library code.
Congratulations on making such good headway! The main thing left is to determine how weakly referenced the coercion framework needs to be.
The following quickly eats up memory:
(This is with 5.10.rc0)
Problem analysis
The quadratic field is created with a coerce embedding into `CLF`. At the same time, this coerce embedding is stored in `CLF._coerce_from_hash`.

The "coerce_from_hash" is a `MonoDict`, hence it has only a weak reference to the key (Q, in this case). However, there still is a strong reference from CLF to the coerce map phi. And phi has a strong reference to its domain, thus to Q. Hence, the existence of CLF prevents garbage collection of Q.

And there is a second chain of strong references from CLF to Q: from CLF to phi, to the parent of phi (i.e., a homset), to the domain Q of this homset.
Suggested solution
We cannot turn the reference from CLF to phi into a weak reference, because then even a strong reference to Q would not prevent phi from garbage collection. Hence, we need to break the above-mentioned reference chains at two points. In the attached branch, maps generally keep a strong reference to the codomain (this is important in composite maps and actions), but those used in the coercion system (and only there!!) will only have a weak reference to the domain, and they set the cdef `._parent` attribute to `None` (hence, we also override `.parent()`, so that it reconstructs the homset if the weak reference to the domain is still valid).

To preserve the `domain()/codomain()` interface, I have removed the method `domain()` and replaced it by a cdef public attribute that either holds a weak reference (which returns the domain when called, hence the interface does not change) or a `ConstantFunction` (which should actually be faster to call than a method). Since accessing a cdef attribute is still faster, the cdef attribute `_codomain` is kept (as this will always be a strong reference), but `_domain` has been removed.

This "weakening of references" is done for the coercions found by `discover_coerce_map_from()` and stored in `_coerce_from_hash`. So, this mainly happens for things done with `_coerce_map_from_()` and with composite maps. Similarly for `_convert_from_hash`.

Weakening is not used on the maps that are explicitly registered by `.register_embedding()` and `.register_coercion()`. This is in order to preserve the connectivity of the coercion graph. The `register_*` methods are only used on selected maps that are of particular importance for the backtrack search in `discover_coerce_map_from()`. These strong registrations do not propagate: compositions of strongly registered coercions found by `discover_coerce_map_from()` will be weakened.

Since weakened maps should not be used outside of the coercion system, their string representation shows a warning to replace them by a copy. The attached branch implements copying of maps in some additional cases.
`SchemeMorphism` cannot inherit from `Morphism`, because of a bug with multiple inheritance of a Python class from Cython extension classes. But once this bug is fixed, we surely want to make `SchemeMorphism` inherit from `Morphism`. This transition is prepared here.

Weakened maps should only be used in the coercion system: a weakened map can become invalid by garbage collection, and the coercion system has the job of removing a map from the coercion cache as soon as it becomes invalid. Maps outside of the coercion system should be safe against invalidation. Hence, when we take a coerce map, we should better create a non-weakened copy. The branch also provides copying (and pickling) for all kinds of maps and morphisms (hopefully no map/morphism class went unnoticed).
In any case, the commit messages should give a concise description of what has been done.
TODO in future tickets
- Provide a variant of `.register_coercion()` that weakens the coercion map. It would hence have the same effect as returning a map by `._coerce_map_from_()`, but of course `._coerce_map_from_()` could not easily be changed in an interactive session.

Effects on the overall functioning of Sage
It is conceivable that some parts of Sage still implicitly suppose that stuff cached with `UniqueRepresentation` is permanently cached, even though the seemingly permanent cache was no more than a consequence of a memory leak in the coercion system. With the attached branch, garbage collection of parent structures will become possible much more often. Hence, code that relied on a fake-permanent cache would now need to create the same parent repeatedly.

I (Simon) have tested how many additional parent creations occur with the attached branch when running `sage -t --all`. The findings are summarised in comment:107: the number of additional parent creations increased by no more than 1% for all but two parent classes (both related to tableaux). I also found that the time to run the tests did not significantly increase.

Jean-Pierre has occasionally stated that some of his computations have been infeasible with the memory leak in the above example. I hope that his computations will now succeed.
CC: @simon-king-jena @nbruin @nthiery @anneschilling @zabrocki
Component: number fields
Keywords: QuadraticField
Author: Simon King, Travis Scrimshaw, Jean-Pierre Flori
Branch:
00b3e2f
Reviewer: Nils Bruin, Jean-Pierre Flori
Issue created by migration from https://trac.sagemath.org/ticket/14711