Closed by jpflori 10 years ago
Sorry, reduction is not to blame. Doing a garbage collection after the first line and then repeating the first line reproduces the crash. Hence, some parent is not correctly kept alive. Anyway, you can't see it, since I didn't post the code yet...
Replying to @nbruin:
Yes of course.
CDF.convert_map_from(Q) should return a copy equivalent to phi with strong references to domain and codomain. If the original phi is a composition of "weak" (coercion generated) maps then all the components of the returned phi should also be strengthened copies.
In other words, you do think that we should distinguish between underscore methods that are used internally in the coercion system and just return the maps, and an "official" interface that returns strong copies. Do I understand correctly?
Concerning compositions, I agree that the parent in the middle should be kept alive by the composed map (even if this map is in the coercion system, hence, domain and codomain are only weakly referenced): If the composed map is kept in memory, then we need to be able to apply the composition, and hence the "man in the middle" needs to be available.
Replying to @simon-king-jena:
In other words, you do think that we should distinguish between underscore methods that are used internally in the coercion system and just return the maps, and an "official" interface that returns strong copies. Do I understand correctly?
Yes, with your approach the maps stored and used in the internals of the coercion systems are not able to stay healthy on their own. They can only survive within a controlled environment. So you cannot let those maps escape into the wild (the royal society for prevention of cruelty to maps would probably have you arrested). I don't see another solution than making a version that is better prepared for the outside world.
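The split between internal "weak" maps and strengthened copies for the outside world could look roughly like this. This is a minimal Python sketch; WeakMap, strengthened_copy and the attribute names are hypothetical illustrations, not Sage's actual API:

```python
import weakref

class WeakMap:
    """Toy stand-in for a coercion-internal map: it refers to its
    domain only weakly, so it cannot survive outside the coercion
    framework on its own."""
    def __init__(self, domain, codomain):
        self._domain_ref = weakref.ref(domain)
        self._codomain = codomain

    def domain(self):
        dom = self._domain_ref()
        if dom is None:
            raise ValueError("domain has been garbage collected")
        return dom

    def strengthened_copy(self):
        """What an official accessor such as convert_map_from could
        hand out: an equivalent map holding a strong reference to the
        domain, safe to keep around outside the framework."""
        strong = WeakMap(self.domain(), self._codomain)
        strong._strong_domain = self.domain()  # pin the domain
        return strong
```

Holding only the weak version, the domain may vanish under your feet after a garbage collection; the strengthened copy keeps it alive.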
Concerning compositions, I agree that the parent in the middle should be kept alive by the composed map (even if this map is in the coercion system, hence, domain and codomain are only weakly referenced): If the composed map is kept in memory, then we need to be able to apply the composition, and hence the "man in the middle" needs to be available.
Yes, you are correct. You might want to check if compositions do tend to occur in the coercion system. They would be quite painful to work with. The natural way of constructing them would be to have a containing map type with _domain, _codomain, _parent set appropriately, together with a sequence of maps. Those maps would normally be normal, healthy maps with their own strong references to their domains and codomains: a composition would hence carry internally a strong reference to both its domain and codomain (due to the first and last map in the sequence).
The generic way of making a "weak" version of a map would still lead to a map that (internally) keeps strong references to domain and codomain. You'd have to make a custom way of weakening this map. Would you weaken the first and last map in the sequence? Then for a composition of two maps, you'd have to separately keep a reference to the middle (co)domain. Or would you have half-weakened maps as well, that only have a weakened domain or codomain?
That makes me realize a further complication: the copying probably has to happen both ways. Before you prepare a map to become part of the coercion system, you'd have to make sure that the map you're holding is not referenced by anyone outside. Thus, you'd have to make sure that either the origin of the map is guaranteed (it's not a map that is referenced elsewhere--I think this will be impossible to verify in practice, since users can register arbitrary maps as coercions) or you have to make a copy before weakening it (if weakening is an in-place operation, as you proposed above), further upping the cost of coercion discovery. Otherwise, registering a coercion might have the side-effect of weakening a map that someone else is holding already.
(these are the kind of snowballing complications I was afraid of by making a separate type of map suitable for the coercion system)
Replying to @nbruin:
Concerning compositions, I agree that the parent in the middle should be kept alive by the composed map (even if this map is in the coercion system, hence, domain and codomain are only weakly referenced): If the composed map is kept in memory, then we need to be able to apply the composition, and hence the "man in the middle" needs to be available.
Yes, you are correct. You might want to check if compositions do tend to occur in the coercion system.
They do frequently occur, because coercions are found by some kind of backtracking algorithm. But, somewhat surprisingly, I don't get the impression that the crashes I am observing come from this.
Anyway, I agree that one should have a strong reference in the middle.
The generic way of making a "weak" version of a map would still lead to a map that (internally) keeps strong references to domain and codomain.
No. You would have a weak reference to the domain and codomain, but a strong reference to the middle. A FormalCompositeMap, by the way, stores two maps __first and __second, and in my current experimental code I simply make __first.codomain a constant function (if it isn't already).
It could in principle mean that the composite map gets deallocated, while __first stays somewhere else in the coercion system, and now keeps a parent (namely the middle one) alive that may be collectable. I'll worry about it later...
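The "strong reference in the middle" can be illustrated with a toy composition. SimpleMap and Composite below are hypothetical illustrations, not Sage's FormalCompositeMap:

```python
import weakref

class SimpleMap:
    """Toy map whose domain and codomain are only weakly referenced."""
    def __init__(self, domain, codomain, func):
        self._dom = weakref.ref(domain)
        self._cod = weakref.ref(codomain)
        self._func = func
    def codomain(self):
        return self._cod()
    def __call__(self, x):
        return self._func(x)

class Composite:
    """Composition of two weak maps.  Even with weak endpoints, it pins
    the parent in the middle, so the composition stays applicable."""
    def __init__(self, first, second):
        self._first, self._second = first, second
        self._middle = first.codomain()   # strong ref: the "man in the middle"
    def __call__(self, x):
        return self._second(self._first(x))
```

Deleting the middle parent elsewhere does not invalidate the composition, because the Composite itself keeps it alive.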
That makes me realize a further complication: the copying probably has to happen both ways. Before you prepare a map to become part of the coercion system, you'd have to make sure that the map you're holding is not referenced by anyone outside. Thus, you'd have to make sure that either the origin of the map is guaranteed (it's not a map that is referenced elsewhere--I think this will be impossible to verify in practice, since users can register arbitrary maps as coercions)
I think the (unwritten, I am afraid) contract is that register_coercion() is only called in __init__. So, perhaps it should rather be _register_coercion(), to remove it from the user interface.
Replying to @simon-king-jena:
I think the (unwritten, I am afraid) contract is that register_coercion() is only called in __init__. So, perhaps it should rather be _register_coercion(), to remove it from the user interface.
Absolutely not! I think it's a very important feature that coercions can be discovered "lazily", i.e., be registered after the fact. It also means (but this is just a fact of life) that, while parents are supposed to be immutable, their relations in the (global-state!) coercion graph can evolve over time.
You could of course have a _register_coercion for internal use that mandates being passed a map with the promise no-one else will keep a reference to that map, but I'm pretty sure we have to keep an advertised register_coercion. You could ask sage-devel, of course.
At some point there was even an idea to have a context manager to temporarily modify the coercion graph:
K = QuadraticField(3)
with coercion(K.real_embeddings()[0]):
    print 1 + K.0
leading to -0.732050807568877 (assuming the first embedding is the negative one).
For basic computations these things are not so essential, but by the time you're a couple of levels deep, e.g., you want to compute the archimedean period matrices of some abelian variety defined over a number field, letting the coercion framework do the required conversions might be the only quick way to actually get your data into the right ring. I think we don't want to take away that possibility.
Replying to @simon-king-jena:
It could in principle mean that the composite map gets deallocated, while __first stays somewhere else in the coercion system, and now keeps a parent (namely the middle one) alive that may be collectable. I'll worry about it later...
Hm. Any time you have different versions of the same map that may diverge in the strength with which they refer to one of their domain, codomain, parent, etc., you'll need to make a copy to accommodate the divergence in strength. So it probably makes sense to only have two levels: either domain, codomain, and parent are referenced strongly, or domain and codomain are referenced weakly (and probably there's no reference to parent at all).
That means that "weak" map compositions in the coercion system need to have a strong reference to the middle domain. That shouldn't be too bad.
Apart from possible efficiency problems, I think this idea can be made to work. The main programming penalty is the added burden on writing new map classes: maps must be able to generate a copy of themselves. You could make this optional: maps unable to copy themselves would be just stored as-is in the coercion framework, with all the memory leak consequences this has. If it's cheap to figure out if maps are weak and/or are capable of generating a weak/strong version of themselves, we could just accommodate both: If a "weakened" map arrives into the coercion framework it can just be used as-is. If a normal map arrives, we see if it can be weakened. If so, we make a weakened copy and store that. Otherwise the map is used as-is.
If a map is about to be passed out of the coercion framework, we check if it's weakened. If not, we can just give out the map itself. Otherwise, we make a strengthened copy and give that. If making a strengthened copy is not possible, we'd have to raise an error.
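The opt-in protocol described above could be phrased like this. All hook names used here (_is_weakened, _weakened_copy, _strengthened_copy) are made up for illustration; the sketch only captures the decision logic:

```python
def store_coercion(cache, key, mor):
    """Entry into the framework: store a weakened copy when the map
    supports weakening, otherwise store it as-is (accepting the leak)."""
    if getattr(mor, "_is_weakened", False):
        cache[key] = mor                       # already weak: use as-is
    elif hasattr(mor, "_weakened_copy"):
        cache[key] = mor._weakened_copy()      # don't mutate the caller's map
    else:
        cache[key] = mor                       # not weakenable: leak risk

def hand_out_coercion(mor):
    """Exit from the framework: never let a weakened map escape."""
    if not getattr(mor, "_is_weakened", False):
        return mor
    if hasattr(mor, "_strengthened_copy"):
        return mor._strengthened_copy()
    raise RuntimeError("map cannot be strengthened for external use")
```

Note that weakening on entry happens on a copy, which also addresses the concern above about weakening a map that someone else is still holding.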
Of course, the main question of whether this is acceptable is whether it can be done with little to no increase in overhead compared to now. Costs come at two points.
I guess the only way to see whether this is worth it is by trying. At least the semantics are clear. I find it a little scary to weigh down the entire interface for "Maps" but if we can make it opt-in it's perhaps not too much of a burden. (We could just gradually upgrade map classes to be "weakenable" as we find them to be responsible for memory leaks)
I just found something crazy: Apparently the rational field got garbage collected in one of the crashing examples. Hard to believe. But it is in something using modular symbols, which is very old code, predating the unique-parent paradigm. So, it could be that it calls the constructor of the rational field, rather than a factory or whatever makes the rational field unique.
Replying to @simon-king-jena:
I just found something crazy: Apparently the rational field got garbage collected in one of the crashing examples. Hard to believe.
Fortunately I was mistaken: It is only the case that the __init__ method is called repeatedly on the same object QQ if you call RationalField() repeatedly. Namely, it is not using a classcall metaclass, but does the caching in __new__---but __init__ is always called after __new__, whether the object is cached or not. That's one of the reasons for introducing the classcall metaclass.
Perhaps it would be a good idea to change RationalField to use ClasscallMetaclass, but of course on a new ticket. Anyway, it is not related to the problem here.
Replying to @simon-king-jena:
Perhaps it would be a good idea to change RationalField to use ClasscallMetaclass, but of course on a new ticket.
I created #15247
Too bad. The idea as suggested doesn't actually solve the memory leak; it just makes it less severe (by a constant factor). The problem is: The weakened maps don't prevent their domain from being GCed, but after that happens they linger (now defunct) in _coerce_from. You'll see that even with your patch in, the example in the ticket description will still eat memory--just a little less quickly. You'll find that CDF._coerce_from_hash will contain a LOT of entries.
If we were to use a mix of _coerce_from and _coerce_to (somehow choosing which one to use) you wouldn't see this problem.
If we really want/need to, we could probably salvage the "weakened map" solution:
- we could install a callback on the weakrefs.
- defunct maps are easy to recognize: they have a dead weakref in their domain. We could just periodically scrub _coerce_from for defunct maps. One possible strategy would be to keep a "reference size" for _coerce_from and every time we add an entry we check if it is now double the reference size. If it is, we trigger gc, scrub, and reset the reference size. This would at least keep the list bounded in size and hopefully limit the number of expensive scrubs we have to do (treating the bounds exponentially should lead to small amortized costs).

I have pushed my branch. I wonder why this did not show up as a post on trac. Anyway, I checked that when clicking on "Commits" then everything is there.
I hesitated to push it before, because it is not ready for review. However, it would be better if we'd all agree what code we are talking about.
Replying to @nbruin:
Too bad. The idea as suggested doesn't actually solve the memory leak; it just makes it less severe (by a constant factor).
I think this is not the case with the branch that I have just uploaded. I did
sage: for D in xrange(2,2**30):
....:     print get_memory_usage()
....:     Q = QuadraticField(-D)
First, the memory consumption very slowly rose from 213.60546875 to 214.10546875 and then remained steady for several minutes, until I interrupted. And I rather think that the increased consumption was just due to the increased size of D.
The problem is: The weakened maps don't prevent their domain from being GCed, but after that happens they linger (now defunct) in _coerce_from. You'll see that even with your patch in, the example in the ticket description will still eat memory--just a little less quickly. You'll find that CDF._coerce_from_hash will contain a LOT of entries.
This is clearly not the case with the current commit. After running thousands of cycles in the above "for" loop, I get
sage: CDF._introspect_coerce()
{'_action_hash': <sage.structure.coerce_dict.TripleDict at 0xa26b25c>,
'_action_list': [],
'_coerce_from_hash': <sage.structure.coerce_dict.MonoDict at 0xa26b3e4>,
'_coerce_from_list': [],
'_convert_from_hash': <sage.structure.coerce_dict.MonoDict at 0xa26b41c>,
'_convert_from_list': [],
'_element_init_pass_parent': False,
'_embedding': None,
'_initial_action_list': [],
'_initial_coerce_list': [],
'_initial_convert_list': []}
sage: gc.collect()
2213
sage: for k,v in CDF._introspect_coerce().iteritems():
....:     if v:
....:         print k, len(v)
....:
_coerce_from_hash 4
_convert_from_hash 3
If we were to use a mix of _coerce_from and _coerce_to (somehow choosing which one to use) you wouldn't see this problem.
I don't see this problem anyway.
If we really want/need to, we could probably salvage the "weakened map" solution:
- we could install a callback on the weakrefs.
What weakrefs are you talking about? Those to domain and codomain?
And what would the callback be supposed to do?
- defunct maps are easy to recognize: they have a dead weakref in their domain. We could just periodically scrub _coerce_from for defunct maps.
Well, with the current commit, if a map becomes defunct by some garbage collection then it is removed from _coerce_from_hash by the same garbage collection; hence, it will not linger around.
A different story is _coerce_from_list, which is a list. Here, we might need to take care of defunct maps. Perhaps the maps there shouldn't be weakened in the first place (_coerce_from_list is filled by register_coercion(), and perhaps it would make sense to keep these maps strong), but I am not sure if this wouldn't re-introduce the QuadraticField leak I have just fixed.
One possible strategy would be to keep a "reference size" for _coerce_from and every time we add an entry we check if it is now double the reference size. If it is, we trigger gc, scrub, and reset the reference size.
As I have stated above, I don't think there is a problem with _coerce_from_hash accumulating garbage. But _coerce_from_list may contain garbage, of limited size, because it is only added to by explicit calls to register_coercion().
I think a better strategy would be to make discover_coercion check whether it meets a defunct map when it performs the backtrack algorithm.
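That check could be as simple as filtering during the walk. A sketch, assuming defunct maps either return None for a dead domain or raise a ValueError (mirroring the invalid-state error seen later in this thread):

```python
def live_maps(registered_maps):
    """Yield only maps whose weakly referenced domain still exists,
    so the backtracking search silently steps over defunct entries."""
    for mor in registered_maps:
        try:
            if mor.domain() is None:      # dead weakref
                continue
        except ValueError:                # map already flagged invalid
            continue
        yield mor
```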
Anyway, here is the problem I am currently having:
sage: E = ModularSymbols(11).2
sage: s = E.modular_symbol_rep()
sage: del E,s
sage: import gc
sage: gc.collect()
1309
sage: E = ModularSymbols(11).2
sage: v = E.manin_symbol_rep()
sage: c,x = v[0]
sage: y = x.modular_symbol_rep()
sage: y.parent()
Abelian Group of all Formal Finite Sums over Integer Ring
sage: A = y.parent().get_action(QQ, self_on_left=False, op=operator.mul)
sage: A.right_domain()
Abelian Group of all Formal Finite Sums over Integer Ring
sage: A.left_domain()
Rational Field
sage: A.codomain()
Traceback (most recent call last):
...
RuntimeError: This action acted on a set that became garbage collected
Hence, if I understand correctly, the "Formal Finite Sums over Rational Field" became garbage collected.
And that's where actions differ from maps:
In the case of coercion maps, we have domain and codomain, which together are used as weak cache keys for the map. Hence, if either domain or codomain become garbage, then the map becomes invalid but is immediately removed from the cache and will thus not hurt.
In the case of actions, we have left domain and right domain, which together are used as weak cache keys for the action. However, in addition to the two domains, an action has a codomain, which currently is weakly referenced (this was in order to fix some memory leak). If the codomain is distinct from both left and right domain and if the codomain becomes garbage, then the action becomes invalid, but it would NOT be removed from the cache, as long as left and right domain are not garbage.
I am afraid that having a strong reference to the codomain of all actions will re-introduce a memory leak fixed elsewhere. But in the above analysis, you may notice that there only is a problem if the codomain is distinct from both left and right domain. Hence, we could be clever and have a strong reference to the codomain only in this case, and a weak reference otherwise.
I will try this now...
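The "strong only if distinct" rule could be sketched like this. ActionCodomainRef is a made-up helper for illustration; Sage's actions are organized differently:

```python
import weakref

class ActionCodomainRef:
    """Hold the codomain of an action weakly when it equals the left or
    right domain (the cache keys keep it alive then), and strongly when
    it is a third, otherwise unreferenced parent."""
    def __init__(self, left, right, codomain):
        if codomain is left or codomain is right:
            self._get = weakref.ref(codomain)
        else:
            self._get = lambda strong=codomain: strong   # strong reference
    def codomain(self):
        cod = self._get()
        if cod is None:
            raise RuntimeError("This action acted on a set that became "
                               "garbage collected")
        return cod
```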
I just learnt that the codomain of an action coincides with the set that is acted upon. But here, we have a sage.categories.action.PrecomposedAction. So, it composes maps phi (from the left domain) and psi (from the right domain) with an action alpha that knows about the codomains of phi and psi only. And thus perhaps we have again the problem of keeping "the middle parent" alive.
Namely, if the underlying set S of alpha is the codomain of psi, but psi is weak, then neither psi nor alpha will keep S alive. But S is sometimes not used as cache key for the precomposed action: only the domains of psi and phi appear in the key. Hence, I think I just need to add a strong reference to the underlying set of alpha.
The proposed change does in fact fix the problem mentioned in comment:63.
I have pushed the new commit. The example from comment:63 became a doctest.
I am now running make ptest. Let us see how many problems persist. Depending on the result, we may worry later whether or not we want to add a safe user interface to the coercion system.
Note that I have retested
sage: for D in xrange(2,2**30):
....:     print get_memory_usage()
....:     Q = QuadraticField(-D)
with the current commit. The memory consumption stays level after a short while, and moreover the numbers returned by get_memory_usage() are now less than 190!
Result of make ptest with the current commit:
sage -t src/sage/interfaces/maxima_abstract.py # 1 doctest failed
sage -t src/sage/matrix/matrix2.pyx # 1 doctest failed
sage -t src/sage/categories/group_algebras.py # Killed due to segmentation fault
sage -t src/sage/combinat/symmetric_group_algebra.py # Killed due to segmentation fault
sage -t src/sage/combinat/ncsf_qsym/ncsf.py # Killed due to segmentation fault
sage -t src/sage/combinat/words/morphism.py # 10 doctests failed
sage -t src/sage/schemes/toric/morphism.py # 4 doctests failed
sage -t src/sage/matrix/matrix0.pyx # 3 doctests failed
sage -t src/sage/categories/pushout.py # Killed due to segmentation fault
sage -t src/sage/rings/polynomial/plural.pyx # 11 doctests failed
sage -t src/sage/libs/singular/function.pyx # 2 doctests failed
sage -t src/sage/structure/coerce.pyx # 5 doctests failed
sage -t src/sage/libs/singular/groebner_strategy.pyx # 2 doctests failed
So, there remains a lot of work to do.
Replying to @simon-king-jena:
I think this is not the case with the branch that I have just uploaded. I did
Got it! Thanks. To have documented why we're OK here: _coerce_from_hash is a MonoDict, so it holds a strong reference to the map stored. This is what keeps the map alive. However, when the domain vanishes, then the MonoDict callback will remove the entry and hence the strong reference to the map. This makes it possible for the map to be GCed. So the required callback to clean up is already triggered via the key.
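This mechanism can be reproduced with Python's weakref.WeakKeyDictionary, which behaves like a MonoDict for the purpose of this argument: weakly held keys, strongly held values, and a callback that drops the entry when the key dies:

```python
import gc
import weakref

class Parent:          # stand-in for a domain parent
    pass

class Map:             # stand-in for a coercion map
    pass

cache = weakref.WeakKeyDictionary()
Q, phi = Parent(), Map()
cache[Q] = phi
phi_ref = weakref.ref(phi)

del phi                # the cache entry still holds the map strongly
gc.collect()
assert phi_ref() is not None

del Q                  # the key dies: the callback removes the entry,
gc.collect()           # releasing the last strong reference to the map
assert phi_ref() is None
```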
What weakrefs are you talking about? Those to domain and codomain? And what would the callback be supposed to do?
Indeed, domain and codomain. So those callbacks are already taken care of by _coerce_from_hash being a MonoDict. The codomain is actually not relevant for this (and would be strongly referenced by the map anyway) -- perhaps "weakened" maps should only have their domain reference weakened and their parent (the reference to the homset) cleared? If we just always keep the codomain strongly referenced, compositions would naturally keep things alive anyway. Virtually all maps have a strong ref to the codomain (via generator images) internally anyway, so putting it explicitly on the outside shouldn't hurt much.
A different story is _coerce_from_list, which is a list. Here, we might need to take care of defunct maps. Perhaps the maps there shouldn't be weakened in the first place (_coerce_from_list is filled by register_coercion(), and perhaps it would make sense to keep these maps strong), but I am not sure if this wouldn't re-introduce the QuadraticField leak I have just fixed.
Do we really need _coerce_from_list? Have we identified what it does that cannot be accomplished with the data stored in _coerce_from_hash?
As I have stated above, I don't think there is a problem with _coerce_from_hash accumulating garbage. But _coerce_from_list may contain garbage, of limited size, because it is only added to by explicit calls to register_coercion().

I think a better strategy would be to make discover_coercion check whether it meets a defunct map when it performs the backtrack algorithm.
I doubt you could prove about such an incidental "bumping in" strategy that the number of defunct maps is bounded linearly in the number of active stuff (perhaps measured by active entries in _coerce_from?). That means you wouldn't be able to prove that you don't have a memory leak, and I suspect that with a little study, one could concoct an example that exhibits the leak as well. If discover_coercion would regularly visit all entries in _coerce_from_list then the process would be way too slow.
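The doubling heuristic proposed earlier in the thread amounts to the following sketch (names are made up); because the scrub threshold grows geometrically with the live size, each entry is scanned O(1) times on average, so the amortized cost per insertion stays small:

```python
import gc

class ScrubbingList:
    """Sketch of the "reference size" strategy: when the list grows to
    twice the size seen after the last scrub, drop defunct entries and
    reset the reference size."""
    def __init__(self, is_defunct):
        self._items = []
        self._is_defunct = is_defunct
        self._reference_size = 1

    def append(self, item):
        self._items.append(item)
        if len(self._items) >= 2 * self._reference_size:
            gc.collect()   # give weakrefs a chance to die first
            self._items = [x for x in self._items
                           if not self._is_defunct(x)]
            self._reference_size = max(1, len(self._items))
```

This keeps the number of defunct entries bounded by the number of live ones, which is exactly the linear bound discussed above.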
Given how delicate this code is, please do put ample documentation about assumptions and strategies in the code. Otherwise, the next person to work on this code will likely screw things up. Hopefully it'll be caught by doctests, but we've seen that this could easily not happen.
So one action point: Have you considered just leaving the codomain strongly referenced? Apart from the fact that _make_weak and _make_strong on map compositions still need to recurse into components, it should make compositions easier to work with, because the first map will keep the middle (co)domain alive, as desired.
It also means _codomain can remain a fast slot.
I currently don't see a scenario where weakening the codomain reference buys us anything, especially because maps internally will usually reference the codomain.
Another action point: see if _coerce_from_list can simply be thrown away.
Replying to @nbruin:
Indeed, domain and codomain. So those callbacks are already taken care of by _coerce_from_hash being a MonoDict. The codomain is actually not relevant for this (and would be strongly referenced by the map anyway) -- perhaps "weakened" maps should only have their domain reference weakened and their parent (the reference to the homset) cleared? If we just always keep the codomain strongly referenced, compositions would naturally keep things alive anyway. Virtually all maps have a strong ref to the codomain (via generator images) internally anyway, so putting it explicitly on the outside shouldn't hurt much.

This might actually be a good idea. Note that some wise person decided to store coerce maps on the codomain. Hence, having a strong reference from the map back to the codomain will not hurt at all, with the additional advantage you just mentioned. It would also fix the problem with composed maps and with PrecomposedAction!
Do we really need _coerce_from_list? Have we identified what it does that cannot be accomplished with the data stored in _coerce_from_hash?
I don't know why it had originally been introduced. But when I added the weak versions of triple and mono dict, I actually thought of it as a way to make some coercions (namely those explicitly registered) permanent.
I doubt you could prove about such an incidental "bumping in" strategy that the number of defunct maps is bounded linearly in the number of active stuff (perhaps measured by active entries in _coerce_from?). That means you wouldn't be able to prove that you don't have a memory leak, and I suspect that with a little study, one could concoct an example that exhibits the leak as well.
Sure. But the more you have to struggle to construct a leak, the happier I'll be... :-P
Given how delicate this code is, please do put ample documentation about assumptions and strategies in the code. Otherwise, the next person to work on this code will likely screw things up. Hopefully it'll be caught by doctests, but we've seen that this could easily not happen.
OK, I'll try to be explicit in either the comments in the code, or in the docs.
I can confirm that with a strong reference to the codomain, the leak is still fixed. Now trying to see if it also prevents some of the crashes I've seen in the tests!
Ouch. Sage started, but the first four tests I've run have segfaulted. Very bad. How can that be?
Replying to @simon-king-jena:
Ouch. Sage started, but the first four tests I've run have segfaulted. Very bad. How can that be?
This is really getting scary. I see in an interactive session that the leak is fixed, but according to the corresponding doctest it isn't fixed.
Replying to @simon-king-jena:
This is really getting scary. I see in an interactive session that the leak is fixed, but according to the corresponding doctest it isn't fixed.
What fun...
When counting the number of quadratic number fields tracked by gc, I did
numberQuadFields = len([x for x in gc.get_objects() if isinstance(x, C)])
And apparently, just by chance, x became a pointer to the number field that I wanted to delete. Hence, I had to add del x to make the tests pass.
I am now back at trying to trace down the errors mentioned in comment:68. Frustratingly, with my current local branch (not yet posted), I cannot reproduce all the errors, even though the code during comment:68 looked nearly the same.
One rather puzzling error:
sage: import gc
sage: SG4 = SymmetricGroupAlgebra(ZZ,4)
sage: gc.collect()
1051
sage: SG4(1).is_central()
------------------------------------------------------------------------
./sage: Zeile 134: 14576 Speicherzugriffsfehler "$SAGE_ROOT/src/bin/sage" "$@"
(the German shell message for a segmentation fault)
Why is this puzzling? Well, it does not occur when I make both domain and codomain weak references, for coerce maps. But it does occur when I only make the domain a weak reference.
So, is it the case that "the more weak references, the more crash safe"??
Replying to @simon-king-jena:
numberQuadFields = len([x for x in gc.get_objects() if isinstance(x, C)])
Perhaps use
numberQuadFields = sum(1 for x in gc.get_objects() if isinstance(x, C))
instead, to prevent leaking x?
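The underlying Python pitfall: in Python 2, a list comprehension (like an ordinary for-loop) leaves its loop variable bound in the enclosing scope, so x kept the last inspected parent alive; a generator expression keeps its loop variable to itself. A small demonstration of the scoping behaviour (runnable as-is under Python 3, where only the plain loop leaks):

```python
class Field:
    """Stand-in for a parent such as a quadratic number field."""

objs = [Field() for _ in range(3)]

# A plain for-loop (and, in Python 2, also a list comprehension)
# leaves its loop variable bound afterwards:
for x in objs:
    pass
assert x is objs[-1]       # x still references the last Field

# A generator expression does not leak its loop variable, so counting
# this way leaves no stray reference that keeps a parent alive:
count = sum(1 for y in objs if isinstance(y, Field))
assert count == 3
try:
    y                      # y was never bound in this scope
    leaked = True
except NameError:
    leaked = False
assert not leaked
```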
It reduces to
sage: SG4 = SymmetricGroupAlgebra(ZZ,4)
sage: SG4._introspect_coerce()['_coerce_from_list']
[Generic morphism:
   From: Integer Ring
   To:   Symmetric group algebra of order 4 over Integer Ring,
 Generic morphism:
   From: Symmetric group algebra of order 3 over Integer Ring
   To:   Symmetric group algebra of order 4 over Integer Ring]
sage: phi, psi = _
sage: import gc
sage: gc.collect()
1062
sage: phi
Generic morphism:
From: Integer Ring
To: Symmetric group algebra of order 4 over Integer Ring
sage: psi
Traceback (most recent call last):
...
ValueError: This map is in an invalid state, domain or codomain have been garbage collected
So, perhaps we need to keep the maps in _coerce_from_list strong after all.
PS: But why does it not crash if the coerce maps' codomains get weakref'd too?
For the record: When I do not weaken the maps in _coerce_from_list or _convert_from_list, the crash vanishes, but the memory leak is still fixed.
Also in my unpublished branch: Skip invalid maps in discover coercion (in the backtracking algorithm). But this is just additional safety. I think we want that explicitly registered maps (i.e., those which are the fundamental pathways in the backtracking algorithm) stay alive. Hence, it might even be worthwhile to keep an explicit reference to the domain of the map, so that the map will remain valid even if something weakens it.
I am now running tests again.
To summarise the requirements found in our discussion for a stable and memory friendly coercion system:
- Maps registered by P.register_coercion(mor) are the backbone of the discover_coercion algorithm. Hence, they need to be kept healthy as long as P lives.
- If P has a coerce embedding then this embedding needs to be kept healthy as long as P lives.

To summarise how I think we can meet these requirements:
- P.register_coercion(mor) keeps a strong reference to mor.domain() in a new attribute P._registered_domains (which is a list). All other maps in the coercion system will be weakened.
- We store the map phi found by P.discover_coercion(Q) in P._coerce_from_hash, which is a MonoDict. In particular, if Q is garbage collected, then phi is immediately removed from the cache, and thus the strong reference of phi to the codomain P will not prevent P from garbage collection. And if P is garbage collected, then the whole cache including phi is removed. However, if both P and Q live, then phi will stay in P._coerce_from_hash[Q], and it will stay healthy, because its domain Q is still alive (and hence the weak reference is still ok).

I think this model makes sense.
And concerning a safe user interface: Perhaps we can do without. Namely, we could make it so that the string representation of a weakened map includes a warning, suggesting to use a copy. In this way, it would be sufficiently likely that the user wouldn't have too many bad surprises.
Replying to @simon-king-jena:
- Maps registered by P.register_coercion(mor) are the backbone of the discover_coercion algorithm. Hence, they need to be kept healthy as long as P lives.

It's not clear to me this is the case. If mor is a map from S to P then P.register_coercion(mor) ensures the coercion system knows how to map an element from S into P. Why is it important to maintain this information for the lifetime of P? If S ceases to exist, then one would not need to map elements from S to P. Why do we need to ensure S keeps existing? Shouldn't P have a reference to S independent of the coercion system if P's health depends on the existence of S? I think the coercion system is there to keep track of relations between existing objects, not to keep objects in existence (although this might be a side-effect of keeping track of relations between other objects).
- If P has a coerce embedding then this embedding needs to be kept healthy as long as P lives.
I presume this is the function of the _embedding coercion attribute? So it looks like we already have that. What if we want multiple embeddings?
- P.register_coercion(mor) keeps a strong reference to mor.domain() in a new attribute P._registered_domains (which is a list). All other maps in the coercion system will be weakened.
So what means are there available to add a coercion without tying the lifespan of one to the other? Shouldn't it be possible to specify such things too?
- We store the map phi found by `P.discover_coercion(Q)` in `P._coerce_from_hash`, which is a `MonoDict`. In particular, if Q is garbage collected, then phi is immediately removed from the cache, and thus phi's strong reference to the codomain P will not prevent P from being garbage collected.
It never did prevent collection: the reference to phi is held by P, so the reference from phi to P would be recognized as part of a cycle, and the cyclic GC would find it (parents are usually involved in cycles already, so cyclic GC is their only chance of being reclaimed anyway). Your statement is still true.
And concerning a safe user interface: Perhaps we can do without. Namely, we could make it so that the string representation of a weakened map will consist of a warning, suggesting to use a copy. In this way, it would be sufficiently likely that the user wouldn't have too many bad surprises.
Hm, we'll see.
With my current, not yet published code, I get errors only in one file, namely with scheme morphisms. And this actually isn't a surprise, since scheme morphisms almost completely ignore the existing methods they inherit from `sage.categories.map.Map`. They have a custom `__init__` doing more or less what the old version of `Map.__init__` used to do, and they even override `domain()` and `codomain()`.
So, this should be fixable, and I suppose I will be able to post a commit so that all tests pass, later today.
Replying to @nbruin:
Replying to @simon-king-jena:
- Maps registered by `P.register_coercion(mor)` are the backbone of the discover_coercion algorithm. Hence, they need to be kept healthy as long as `P` lives.

It's not clear to me that this is the case. If `mor` is a map from S to P, then `P.register_coercion(mor)` ensures the coercion system knows how to map an element from S into P. Why is it important to maintain this information for the lifetime of P? If S ceases to exist, then one would not need to map elements from S to P.
One does! Namely, suppose that you first do `P.register_coercion(mor)`, then you allow `S` to become garbage collected, and then you create a new parent `Q` with an embedding into what looks like `S` (which by now is, of course, a replica of the original, garbage collected `S`).
You would want `Q` to coerce into `P` via `S`, by transitivity of coercions. But you collected `mor.domain()`, and thus you will fail to find this coercion.
Another attempt to explain why I think the explicitly registered maps need to be kept:
Think of the coercions as a digraph.
- The vertices correspond to parents.
- The arrows (oriented edges) of the digraph correspond to those maps that are declared as coercions by `.register_coercion(...)`.
- In addition to the arrows created with `register_coercion`, the coercion system may find short-cuts by calling `._coerce_map_from_(...)`. For efficiency reasons, these short-cuts are stored in a cache.
- The coercion system then tries to find directed paths between two given vertices, including short-cuts. For efficiency reasons, it stores the resulting composed maps in a cache.

How does the coercion system find "directed paths with short-cuts" from Q to P?
Remark: Perhaps in the long run, we could actually use a proper digraph (with a lightning-fast backend) to keep track of coercions.
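The path search sketched above can be illustrated by a toy model. This is a plain-Python sketch (all names hypothetical, not Sage's actual backtrack algorithm): registered coercions form the arrows, a breadth-first search discovers a directed path, and the discovered composite is cached as a short-cut.

```python
from collections import deque

# Toy coercion digraph: registered arrows only (the "backbone").
# Q -> S -> P, as in the register_coercion example above.
arrows = {"Q": ["S"], "S": ["P"]}
shortcuts = {}   # cache of discovered composite paths

def discover_path(start, goal):
    """Breadth-first search for a directed path; cache the result as a short-cut."""
    if (start, goal) in shortcuts:
        return shortcuts[(start, goal)]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            shortcuts[(start, goal)] = path   # store the composite
            return path
        for nxt in arrows.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None   # no coercion found

assert discover_path("Q", "P") == ["Q", "S", "P"]
assert ("Q", "P") in shortcuts   # subsequent lookups use the cached short-cut
```

This also shows why the registered arrows matter for connectivity: if the vertex `"S"` (and its arrows) were dropped, the search from `"Q"` to `"P"` would only succeed if the composite had already been cached as a short-cut.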
Now for garbage collection:
Why do we need to ensure S keeps existing? Shouldn't P have a reference to S independent of the coercion system if P's health depends on the existence of S? I think the coercion system is there to keep track of relations between existing objects, not to keep objects in existence (although this might be a side-effect of keeping track of relations between other objects).
I think that in many cases it would indeed be the case that P should reference S independent of coercions (say, if S is the base ring of P). However, it is conceivable (and I think I've met such cases in failing doctests) that P can remain perfectly healthy without S, but still we would like to find coercions from Q to P via S.
So, the coercion system will not care for the health of P, but it must care for the connectivity of the coercion graph. And in the current algorithm sketched above, it is of vital importance to keep arrows leading to P alive, since otherwise we have to live with shortcuts only.
- If P has a coerce embedding, then this embedding needs to be kept healthy as long as P lives.

I presume this is the function of the `_embedding` coercion attribute? So it looks like we already have that. What if we want multiple embeddings?
Then we look into the code and see that multiple embeddings are explicitly excluded. A parent has precisely one embedding that is taken into account by the coercion search in the forward direction. You may of course register further embeddings as coercions, by `emb.codomain().register_coercion(emb)`, but they would only be used for searches in the backward direction.
`P.register_coercion(mor)` keeps a strong reference to `mor.domain()` in a new attribute `P._registered_domains` (which is a list). All other maps in the coercion system will be weakened.

So what means are available to add a coercion without tying the lifespan of one to the other?
Note that after `P.register_coercion(mor)`, `mor.domain()` will live at least as long as P. But `mor.domain()` alone would not be enough to keep P alive.
And there surely is a means available to add a coercion that doesn't tie the lifespans of two parents too closely: implement `P._coerce_map_from_(Q)`, which can return a map or simply `True` (in the latter case, conversion is used to coerce Q into P). The result is cached in `P._coerce_from_hash`, but not in `P._coerce_from_list`.
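The difference between this cached-only mechanism and a strong registration can be sketched in plain Python. The following is a toy model (hypothetical `Parent`/`PolyRing` classes, not Sage's implementation): the `_coerce_map_from_` hook supplies a coercion on demand, the result lands in a weak-key cache, and the cache does not keep the domain alive.

```python
import gc
import weakref

class Parent:
    """Hypothetical parent with a weak-key coercion cache."""
    def __init__(self, name):
        self.name = name
        # entries disappear as soon as the domain is garbage collected
        self._coerce_from_hash = weakref.WeakKeyDictionary()

    def _coerce_map_from_(self, Q):
        """Hook: return a map, True (meaning: use conversion), or None."""
        return None

    def coerce_map_from(self, Q):
        if Q in self._coerce_from_hash:
            return self._coerce_from_hash[Q]
        found = self._coerce_map_from_(Q)
        if found is True:
            found = ("conversion", Q.name, self.name)  # stand-in for a map
        if found is not None:
            self._coerce_from_hash[Q] = found          # cached, but not pinned
        return found

class PolyRing(Parent):
    """Toy rule: accept coercion (via conversion) from anything named 'QQ'."""
    def _coerce_map_from_(self, Q):
        return True if Q.name == "QQ" else None

P = PolyRing("QQ[x]")
Q = Parent("QQ")
assert P.coerce_map_from(Q) == ("conversion", "QQ", "QQ[x]")
assert len(P._coerce_from_hash) == 1

del Q          # P's cache does not keep Q alive...
gc.collect()
assert len(P._coerce_from_hash) == 0   # ...so the cached coercion is gone
```

With `register_coercion`, by contrast, the domain would go into something like `P._registered_domains` and survive for the lifetime of `P`.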
Shouldn't it be possible to specify such things too?
Perhaps in the long run? I wouldn't like to do those things (such as: Rewrite the coercion system to use a proper digraph backend) here.
Replying to @simon-king-jena:
With my current, not yet published code, I get errors only in one file, namely with scheme morphisms. And this actually isn't a surprise, since scheme morphisms almost completely ignore the existing methods they inherit from `sage.categories.map.Map`. They have a custom `__init__` doing more or less what the old version of `Map.__init__` used to do, and they even override `domain()` and `codomain()`.
Arrgh, it is even worse: `SchemeMorphism` just inherits from `Element`, not from `Map`! Unbelievable.
Author: Simon King
I have pushed two more commits, and now all doctests should pass (to be verified).
I also think I have extensively explained the rationale behind the changes. Hence, I have already clicked "Needs review". But it was too soon (see below).
A potential "todo": We might want a weakened map returned by `P.coerce_map_from(Q)` to print as
sage: QQ['x'].coerce_map_from(QQ)
This is a map used by the coercion system.
If you want to use it outside of the coercion system,
please use a copy of the map
sage: copy(_)
Polynomial base injection morphism:
From: Rational Field
To: Univariate Polynomial Ring in x over Rational Field
In this case, we may want to have `__copy__` methods everywhere. Even though the current branch improves copying, I just obtained the following crash that I want to fix first:
sage: QQ['x'].coerce_map_from(ZZ)
Composite map:
From: Integer Ring
To: Univariate Polynomial Ring in x over Rational Field
Defn: Natural morphism:
From: Integer Ring
To: Rational Field
then
Polynomial base injection morphism:
From: Rational Field
To: Univariate Polynomial Ring in x over Rational Field
sage: phi = copy(_)
sage: phi(2)
<SEGFAULT>
Work Issues: Provide copy of composed maps
Done, and now I'd say it can be reviewed. With the latest commit, one can do
sage: phi = QQ['x'].coerce_map_from(ZZ)
sage: phi.domain
<weakref at 0xa225284; to 'sage.rings.integer_ring.IntegerRing_class' at 0x96ddd3c (EuclideanDomains.parent_class)>
sage: type(phi)
<type 'sage.categories.map.FormalCompositeMap'>
sage: psi = copy(phi)
sage: psi
Composite map:
From: Integer Ring
To: Univariate Polynomial Ring in x over Rational Field
Defn: Natural morphism:
From: Integer Ring
To: Rational Field
then
Polynomial base injection morphism:
From: Rational Field
To: Univariate Polynomial Ring in x over Rational Field
sage: psi(3)
3
sage: psi.domain
The constant function (...) -> Integer Ring
which required implementing `__copy__` for polynomial basering injections and formal composite maps.
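The idea behind copying a formal composite map can be sketched in plain Python: the copy must copy its constituent maps as well, rather than share them with the coercion system (where they may be weakened). All names here are hypothetical stand-ins, not the actual Sage classes.

```python
import copy

class FormalCompositeMap:
    """Hypothetical composite map: apply `first`, then `then`."""
    def __init__(self, first, then):
        self.first = first
        self.then = then

    def __call__(self, x):
        return self.then(self.first(x))

    def __copy__(self):
        # Copy the constituents too, so the copy does not share
        # (possibly weakened) inner maps with the coercion system.
        return FormalCompositeMap(copy.copy(self.first), copy.copy(self.then))

# Plain callables as stand-ins for the inner maps.
inc = lambda x: x + 1
dbl = lambda x: 2 * x

phi = FormalCompositeMap(inc, dbl)
psi = copy.copy(phi)        # dispatches to __copy__
assert psi(2) == 6          # (2 + 1) * 2
```

In the real fix, the copied constituents would additionally be "strengthened" (given strong references to their domains), which is what prevents the segfault shown above.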
If you want me to make weakened coercion maps print as a big warning, then I could of course do so, but will only do if you ask.
Too bad. Apparently the elliptic curve code does not like `SchemeMorphism` to be a `Morphism`...
Changed work issues from Provide copy of composed maps to Fix elliptic curves code
Namely:
sage: E=EllipticCurve('37a1')
sage: P=E(0,0)
sage: Q=5*P
<Booom>
apparently while trying to create an action.
Aha.
sage: P.__class__.mro()
[sage.schemes.elliptic_curves.ell_point.EllipticCurvePoint_number_field,
sage.schemes.elliptic_curves.ell_point.EllipticCurvePoint_field,
sage.schemes.projective.projective_point.SchemeMorphism_point_abelian_variety_field,
sage.structure.element.AdditiveGroupElement,
sage.structure.element.ModuleElement,
sage.schemes.projective.projective_point.SchemeMorphism_point_projective_field,
sage.schemes.projective.projective_point.SchemeMorphism_point_projective_ring,
sage.schemes.generic.morphism.SchemeMorphism_point,
sage.schemes.generic.morphism.SchemeMorphism,
sage.categories.morphism.Morphism,
sage.categories.map.Map,
sage.structure.element.Element,
sage.structure.sage_object.SageObject,
object]
and
sage: P.domain
<bound method EllipticCurvePoint_number_field.domain of (0 : 0 : 1)>
sage: P.domain.__module__
'sage.schemes.elliptic_curves.ell_point'
sage: P.domain??
Type: instancemethod
String Form:<bound method EllipticCurvePoint_number_field.domain of (0 : 0 : 1)>
File: /home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/schemes/elliptic_curves/ell_point.py
Definition: P.domain(self)
Source:
def domain(self):
"""
Return the domain of this point, which is `Spec(F)` where `F` is
the field of definition.
EXAMPLES::
sage: E=EllipticCurve(QQ,[1,1])
sage: P=E(0,1)
sage: P.domain()
Spectrum of Rational Field
sage: K.<a>=NumberField(x^2-3,'a')
sage: P=E.base_extend(K)(1,a)
sage: P.domain()
Spectrum of Number Field in a with defining polynomial x^2 - 3
"""
return self.parent().domain()
So, not only was `SchemeMorphism` ignoring `Morphism`, but additionally its subclass `EllipticCurvePoint_field` overrode what it inherited from `SchemeMorphism` with a verbatim copy.
The elliptic curve code is a mess.
In `sage.schemes.elliptic_curves.heegner`, it also seems to me that `GaloisAutomorphism` should use `Morphism`. I don't know if I will change it here.
Hmm. If we are in very bad luck, then the current commit triggers a bug in Cython concerning confusion of the cpdef slots when one provides two different base classes (`ModuleElement` and `Morphism`). Namely, it pretty much seems that
(<ModuleElement>left)._add_(<ModuleElement>right)
does not actually invoke
left._add_(right)
Hooray!
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 3777.0 seconds
cpu time: 5867.6 seconds
cumulative wall time: 7256.6 seconds
I added a "todo" to the documentation of `sage.schemes.generic.morphism`, stating that `SchemeMorphism` should rather inherit from `Morphism`, but currently cannot, because of a bug in Cython. I modified the code so that it now exactly imitates the new `Morphism` code. In this way, a future transition to a subclass of `Morphism` should be easier.
Anyway, since the tests pass, I'd say it is "needs review" for now.
Plan: If you believe that the new approach makes sense, then I'll also add a longer section to the reference manual, explaining how the weakref business is supposed to work.
By the way, I was looking into my logs/ptest.log, and it seems that the total cpu time for the tests remained essentially the same. Hence, hopefully there is no bad slow-down. But we should probably take some old benchmarks used at, say, #11900, and see what my new commits do with them.
Here are the tests from #11900.
In Commit:05fb569, I get
sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 10.76 s, sys: 0.18 s, total: 10.93 s
Wall time: 10.95 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 3.12 s, sys: 0.00 s, total: 3.12 s
Wall time: 3.13 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 2.28 s, sys: 0.02 s, total: 2.29 s
Wall time: 2.30 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 4.80 s, sys: 0.04 s, total: 4.84 s
Wall time: 4.85 s
sage: def test(E):
....: for p in prime_range(10000):
....: if p != 389:
....: G = E.change_ring(GF(p)).abelian_group()
....:
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 31.56 s, sys: 0.09 s, total: 31.65 s
Wall time: 31.70 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 21.96 s, sys: 0.06 s, total: 22.02 s
Wall time: 22.07 s
In public/sage-git/master, I get
sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 10.72 s, sys: 0.17 s, total: 10.90 s
Wall time: 10.91 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 3.23 s, sys: 0.00 s, total: 3.23 s
Wall time: 3.24 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 2.21 s, sys: 0.02 s, total: 2.23 s
Wall time: 2.23 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 5.27 s, sys: 0.04 s, total: 5.32 s
Wall time: 5.32 s
sage: def test(E):
....: for p in prime_range(10000):
....: if p != 389:
....: G = E.change_ring(GF(p)).abelian_group()
....:
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 32.06 s, sys: 0.11 s, total: 32.17 s
Wall time: 32.22 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 22.11 s, sys: 0.07 s, total: 22.17 s
These tests have been found to be sensitive to slowness in the category and coercion framework. So the fact that there is not the faintest regression in these tests gives some confidence.
Replying to @simon-king-jena:
One does! Namely, suppose that you first do `P.register_coercion(mor)`, then you allow `S` to become garbage collected, and then you create a new parent `Q` with an embedding into what looks like `S` (which by now is, of course, a replica of the original, garbage collected `S`).

You would want `Q` to coerce into `P` via `S`, by transitivity of coercions. But you collected `mor.domain()`, and thus you will fail to find this coercion.
I agree that there is a place for such strong connections, but I have severe reservations about declaring it's the only way or even the default way to inform the system about coercions.
The following would leak memory with the model you propose:
M1 = ZZ^1
M2 = ZZ^2
for d in [2..100000]:
    N = ZZ^d
    m1 = Hom(N, M1)([M1.0 for i in range(d)])
    m2 = Hom(N, M2)([M2.gen(i % 2) for i in range(d)])
    M1.register_coercion(m1)
    M2.register_coercion(m2)
I have severe reservations about declaring that this code "will never be memory efficient" in Sage. One way out is to use the (for this purpose) extremely inappropriately named `N.register_embedding(m1)` instead, which in your model would mean that M1 lives at least as long as N. In addition, we would not be able to do this for both `M1` and `M2`.
I think there's room for a `register_embedding` with those semantics (although it should probably have a different name, because it doesn't have to be an embedding), as well as for a `register_coercion_from_while_keeping_domain_alive` with a catchier name. But the life-enhancing side effects are fundamentally separate from the fact that there's a coercion in the first place, and I think it should be possible to register a non-life-enhancing coercion as well (the main reason being that it's easily done and that I haven't seen a proof that it's not necessary; we don't want to unnecessarily hamstring the system).
philosophical ramblings
The following observations are the result of trying to come up with a conceptual ideal/model that we could use to argue what our coercion framework SHOULD do. Up to now, it seems to me, people have mainly been making ad-hoc, gut-feeling decisions about how to put together coercion. I haven't found a better solution here, but I think some examples may illustrate that we might just have to be pragmatic about this.
Your suggestion makes sense if you want to model a "permanent, unchanging universe" in which all possible parents, with consistent coercions, already exist; we're simply "discovering" this universe in a piecemeal fashion. It's a common model in modern algebra, but I think our computer algebra model is too far away from this ideal to follow this model too far. Consider:
Qx.<x>=QQ[]
K.<a>=NumberField(x^4-2)
L.<b>=NumberField(x^2-2,embedding=a^2)
This fits perfectly into the "unchanging universe" model. Also note that the coercion system does not need to let L keep K alive, since the construction parameters, which are kept alive for the life of L by `CachedRepresentation` or something analogous, refer to K already.
Now consider
M.<b>=NumberField(x^2-2)
In the "unchanging universe" (and in sage as well) we have that M is distinct from L. However, I think it's unrealistic to expect that all "embeddings" etc. can be specified at construction time. So I think, even though it's not possible currently in sage, that one should allow for
m1=Hom(M,K)([a^2])
m2=Hom(M,K)([-a^2])
M.register_embedding(m1)
Note that the choice of `m1` or `m2` here leads to different relations between `M` and `K`, and hence to different universes.
In other words, our concept of "globally unique" is not powerful enough to capture the full identity of objects, which would include the coercion relations with objects that haven't been "discovered" yet. In practice, we can usually work around that, for instance by changing the names of generators and hence creating artificially differently labelled objects. But that is already not a possibility for creating different copies of `ZZ^n`, since there are no generator names to choose there.
I think one has to accept the reality here: what we have is a collection of objects whose relations do change in time.
Some particular responses (most of them direct corollaries from what is observed above).
Another attempt to explain why I think the explicitly registered maps need to be kept:
Think of the coercions as a digraph.
- The vertices correspond to parents.
- The arrows (oriented edges) of the digraph correspond to those maps that are declared as coercions by `.register_coercion(...)`.
- In addition to the arrows created with `register_coercion`, the coercion system may find short-cuts by calling `._coerce_map_from_(...)`. For efficiency reasons, these short-cuts are stored in a cache.
- The coercion system then tries to find directed paths between two given vertices, including short-cuts. For efficiency reasons, it stores the resulting composed maps in a cache.
That's not the only thing coercion does. It may also find "common covering structures", which may lead to construction of new parents. Those definitely don't deserve to get nailed into memory. Yet, the code that creates these parents will look (to the coercion system) as a "user", so it would be using these strong-referencing coercion registration routines.
So, the coercion system will not care for the health of P, but it must care for the connectivity of the coercion graph. And in the current algorithm sketched above, it is of vital importance to keep arrows leading to P alive, since otherwise we have to live with shortcuts only.
I think it may well be a feature, not a bug, that one at some point can just be left with the shortcuts and that the intermediates have fallen out. The natural way of staying close to "discovering a permanent universe" is by never throwing away anything that has been discovered, and I think we agree that's a "memory leak". So it's really a matter of only keeping things around that we still need. If we're too lax with considering what is "needed", we'll end up keeping too many things in memory.
Then we look into the code and see that multiple embeddings are explicitly excluded. A parent has precisely one embedding that is taken into account by the coercion search in the forward direction. You may of course register further embeddings as coercions, by `emb.codomain().register_coercion(emb)`, but they would only be used for searches in the backward direction.
Not only that: you'd be tying different lifetime implications to that construction too.
And there surely is a means available to add a coercion that doesn't tie the lifespans of two parents too closely: implement `P._coerce_map_from_(Q)`, which can return a map or simply `True` (in the latter case, conversion is used to coerce Q into P). The result is cached in `P._coerce_from_hash`, but not in `P._coerce_from_list`.
You mean: implement `P._install_coerce_map_from_(m)`, which does `_coerce_from_hash[Domain(m)] = m`. I think it is quite important to be able to manipulate the coercion graph without having to modify library code.
Congratulations on making such good headway! The main thing left is to determine how weakly referenced the coercion framework needs to be.
The following quickly eats up memory:
(This is with 5.10.rc0)
Problem analysis
The quadratic field is created with a coerce embedding into `CLF`. At the same time, this coerce embedding is stored in `CLF._coerce_from_hash`.

The "coerce_from_hash" is a `MonoDict`, hence it has only a weak reference to the key (Q, in this case). However, there still is a strong reference from CLF to the coerce map phi. And phi has a strong reference to its domain, thus to Q. Hence, the existence of CLF prevents garbage collection of Q.

And there is a second chain of strong references from CLF to Q: from CLF to phi, to the parent of phi (i.e., a homset), to the domain Q of this homset.
Suggested solution
We cannot turn the reference from CLF to phi into a weak reference, because then even a strong reference to Q would not prevent phi from garbage collection. Hence, we need to break the above-mentioned reference chains at two points. In the attached branch, maps generally keep a strong reference to the codomain (this is important in composite maps and actions), but those used in the coercion system (and only there!!) will only have a weak reference to the domain, and they set the cdef `._parent` attribute to `None` (hence, we also override `.parent()`, so that it reconstructs the homset if the weak reference to the domain is still valid).

To preserve the `domain()/codomain()` interface, I have removed the method `domain()` and replaced it by a cdef public attribute that either holds a weak reference (which returns the domain when called, hence the interface does not change) or a `ConstantFunction` (which should actually be faster to call than a method). Since accessing a cdef attribute is still faster, the cdef attribute `_codomain` is kept (as this will always be a strong reference), but `_domain` has been removed.

This "weakening of references" is done for the coercions found by `discover_coerce_map_from()` and stored in `_coerce_from_hash`. So, this mainly happens for things done with `_coerce_map_from_()` and with composite maps. Similarly for `_convert_from_hash`.

Weakening is not used on the maps that are explicitly registered by `.register_embedding()` and `.register_coercion()`. This is in order to preserve the connectivity of the coercion graph. The `register_*` methods are only used on selected maps that are of particular importance for the backtrack search in `discover_coerce_map_from()`. These strong registrations do not propagate: compositions of strongly registered coercions found by `discover_coerce_map_from()` will be weakened.

Since weakened maps should not be used outside of the coercion system, their string representation shows a warning to replace them by a copy. The attached branch implements copying of maps in some additional cases.
`SchemeMorphism` cannot inherit from `Morphism`, because of a bug with multiple inheritance of a Python class from Cython extension classes. But once this bug is fixed, we surely want to make `SchemeMorphism` inherit from `Morphism`. This transition is prepared here.

Weakened maps should only be used in the coercion system: a weakened map can become invalid by garbage collection, and the coercion system has the job of removing a map from the coercion cache as soon as it becomes invalid. Maps outside of the coercion system should be safe against invalidation. Hence, when we take a coerce map, we should better create a non-weakened copy. The branch also provides copying (and pickling) for all kinds of maps and morphisms (hopefully no map/morphism class went unnoticed).
In any case, the commit messages should give a concise description of what has been done.
TODO in future tickets
- Provide a variant of `.register_coercion()` that weakens the coercion map. It would hence have the same effect as returning a map by `._coerce_map_from_()`, but of course `._coerce_map_from_()` could not easily be changed in an interactive session.

Effects on the overall functioning of Sage
It is conceivable that some parts of Sage still implicitly suppose that stuff cached with `UniqueRepresentation` is permanently cached, even though the seemingly permanent cache was no more than a consequence of a memory leak in the coercion system. With the attached branch, garbage collection of parent structures will become possible much more often. Hence, code that relied on a fake-permanent cache would now need to create the same parent repeatedly.

I (Simon) have tested how many additional parent creations occur with the attached branch when running `sage -t --all`. The findings are summarised in comment:107: the number of additional parent creations increased by no more than 1% for all but two parent classes (both related to tableaux). I also found that the time to run the tests did not significantly increase.

Jean-Pierre has occasionally stated that some of his computations have been infeasible with the memory leak in the above example. I hope that his computations will now succeed.
CC: @simon-king-jena @nbruin @nthiery @anneschilling @zabrocki
Component: number fields
Keywords: QuadraticField
Author: Simon King, Travis Scrimshaw, Jean-Pierre Flori
Branch:
00b3e2f
Reviewer: Nils Bruin, Jean-Pierre Flori
Issue created by migration from https://trac.sagemath.org/ticket/14711