Coercion discovery fails to be transitive

nbruin commented 11 years ago

As found in #14711:comment:134, the following example shows that a combination of register_embedding and register_coercion can lead to a failure in transitivity for coercion discovery; also discussed on sage-devel:

class pA(Parent): pass
class pB(Parent): pass
class pC(Parent): pass

A=pA(); B=pB(); C=pC()

BtoA=Hom(B,A)(lambda x: A(x))
AtoC=Hom(A,C)(lambda x: C(x))
A.register_coercion(BtoA)
A.register_embedding(AtoC)

G=get_coercion_model()
G.discover_coercion(A,C) #finds AtoC
G.discover_coercion(B,A) #finds BtoA
G.discover_coercion(B,C) #does not find the composition of BtoA with AtoC

One workaround is simple: just don't use register_embedding. However, after #14711, there are different lifetime implications between using register_embedding and register_coercion, so this workaround might not be desirable.

Depends on #14711 Depends on #15329 Depends on #15331

CC: @simon-king-jena @robertwb @jpflori

Component: coercion

Work Issues: rebase (11 files conflicting)

Author: Simon King

Branch/Commit: u/SimonKing/ticket/15303 @ 12e8055

Issue created by migration from https://trac.sagemath.org/ticket/15303

nbruin commented 11 years ago

comment:1

The problem is that in the present implementation, both coercions are stored on A, so if a coercion from B to C is requested, it hard to realize that A should even be considered.

As far as I understand, the coercion model should behave as a digraph on the parents, where a coercion between parents exists if there is a path from one to the other. In that model, coercion existence should be transitive, so the behaviour described is a bug.

One way to work around it is, at least for coercion discovery, to store the coercion always on the codomain. By #14711 this can now happen without the implication that the codomain will keep the domain alive. We would have to store it in a place where it can be used for coercion discovery, though.

If it is desirable to keep the current lifetime implications for register_embedding (and it probably is) then we should ensure this separately, either by also storing the embedding map on the domain or by just having a direct reference from the domain to the codomain.

simon-king-jena commented 11 years ago

comment:2

It has also been discussed on sage-devel that the existence of a coercion from B to C would change over time. Namely:

Create B and C. You will not find a coercion, because A is not known yet.
Create A, registering a coercion from B to A and an embedding of A into C.
Provided that we fixed the problem of transitivity, we would now find a coercion from B to C via A.

I suppose that this phenomenon can not be avoided: We can not have a static coercion digraph (otherwise we would restrict ourselves to a fixed finite set of usable parents), and when we have a dynamic coercion graph, then it is, well, dynamic.

However, one additional complication arises with the current implementation: The absence of a coercion is cached. Hence, if we asked for a coercion in 1., then in 3. you would still not get a coerce map, because the absence of a coercion has been cached in 1.

Let phi: A -> B and do A.register_embedding(phi). Currently, B is not aware of the existence of phi. Hence, the backtracking algorithm will currently ignore phi. We don't want to store phi as a strong attribute of B, hence, don't want to put it into B._coerce_from_list. But perhaps we could create a new attribute of type MonoDict and store the embedding there? For example B._coerce_embeddings_from[A] = phi, in addition to A._coerce_embedding = phi. Since it is a MonoDict and thus has weak references to the keys, B would not prevent A from being garbage collected (but of course A would still keep B alive, because we registered the coerce embedding).

Once this is done, we could change the backtracking so that it not only iterates over B._coerce_from_list, but it additionally iterates over B._coerce_embeddings_from.

But what about the problem of caching coercions? We should carefully think if the current scenario (A.register_coercion(B) plus A.register_embedding(C)) is the only scenario that could dynamically change the coercion graph. Here, I assume that self.register_embedding() and self.register_coercion() are only called during self.__init__().

How to make it possible to allow a coercion via a newly created parent?

Perhaps the following would be feasible: If we do A.register_embedding(psi), where psi: A -> C, then we clear the coercion cache of C, in the sense of all cache items stating the absence of a coercion to C will be deleted.

Note that clearing it in the sense of all cached coercions to C will be deleted is likely to be a bad idea, because the cached coercions might have already been used. So, we should restrict ourselves to allowing new coercions by removing the "there is no coercion" flag.

So, would the suggestion make sense to you?

Have a new MonoDict that stores all coerce embeddings leading into the parent (hence, we would have a strong reference from domain to codomain, and I suggest adding a weak reference from codomain to domain), that will be used in the backtracking algorithm.
When registering an embedding, then remove all "there is no coercion" flags from the coercion cache of the codomain.

Hm. Probably this is not good enough. What if we have had D.coerce_map_from(C), and had cached the absence of a coercion from B to D? After adding A, we would find a coercion from B to D via A and C. Hence, cleaning the coercion cache of C would not be enough---we would need to clean the coercion cache of all parents into which C coerces.

Given this example, it seems to me that we could only solve the problem in a drastic way: Do not cache the absence of a coercion at all. But caching the absence of a coercion is essential for speed. So, the drastic solution would probably be correct, but highly infeasible.

If nobody has a better suggestion, then I think we should restrict ourselves to just fix the transitivity problem; the cache problem (it seems to me) is not solvable without creating a drastic slow-down.

nbruin commented 11 years ago

comment:3

Replying to @simon-king-jena:

Provided that we fixed the problem of transitivity, we would now find a coercion from B to C via A.

There would be an additional effect, by the way: If the coercion that does get discovered is stored as a composition (on C) then there is now a reference from C._coerce_from_hash to A. This reference lives in a MonoDict keyed by B, so this reference remains as long as B is alive. So we see that as long as B and C are alive, then A will be kept alive as well (thus, with monodicts we can have objects whose lifetime depends on two objects BOTH being alive)

Note that if the composition gets computed in a smarter way (for instance, a composition of homomorphisms between finitely generated rings could be explicitly computed by computing the images of the generators and constructing a straight homomorphism from B to C out of that) then the resulting coercion map might not refer to A anymore.

I am not saying that this is avoidable nor that we should consider this a memory leak: we're keeping A in memory for a good reason, even if the reason is not directly supplied by the user.

However, one additional complication arises with the current implementation: The absence of a coercion is cached. Hence, if we asked for a coercion in 1., then in 3. you would still not get a coerce map, because the absence of a coercion has been cached in 1.

There are milder ways of invalidating caches: We could mark cached non-existence with a "coercion graph version number". When a non-existence is retrieved, one could check the current "coercion graph version number" and if the current graph is newer, we'd have to reinvestigate the "None". Frankly, I don't believe that would give acceptable performance either, but we could at least be much more lazy about invalidating "no coercion exists" findings. The reason why I think this would still not be good enough is because I expect that coercion graph mutations occur fairly frequently (basically with any parent creation) and I don't see a way to localize coercion graph versions, so any mutation would invalidate ALL non-existence caches.

For example B._coerce_embeddings_from[A] = phi, in addition to A._coerce_embedding = phi. Since it is a MonoDict and thus has weak references to the keys, B would not prevent A from being garbage collected (but of course A would still keep B alive, because we registered the coerce embedding).

Yes, something along those lines. In fact, since _coerce_embeddings_from and _coerce_from_list would serve the same purpose from the perspective of coercion discoveries, I think the entries in _coerce_from_list should also be added to _coerce_embeddings_from, because it would simplify the code. By then it would make more sense to call the attribute _coerce_from_to_be_used_in_backtracking and the list that is now _coerce_from_hash could be _coerce_from_cached_results.

By then we should probably be storing ALL maps there in weakened form.

The only function of _coerce_from_list now is to keep domains alive, so it would be more economical to replace that by a list _domains_to_keep_alive where we can store strong references to the domains of the corresponding weakened maps that are now in _coerce_from_to_be_used_in_backtracking

So we would end up with (names up for choice):

P._coercion_cache: MonoDict containing discovered (weakened) maps into P.
P._coercion_maps: MonoDict containing (weakened) maps into P that get used by backtracking
P._domains_to_keep_alive: List of domains that should be kept alive as long as P lives.
P._codomains_to_keep_alive: List of codomains that should be kept alive as long as P lives. Normally, such codomains Q would have an entry in Q._coercion_maps[P]. Just storing a map in P._coerce_embedding would also have this effect.

In the present naming system, it wasn't immediately clear to me what purposes _coerce_from_hash and _coerce_from_list served, so changing those names could be beneficial.

If nobody has a better suggestion, then I think we should restrict ourselves to just fix the transitivity problem; the cache problem (it seems to me) is not solvable without creating a drastic slow-down.

Yes, I don't have any suggestions that I seriously believe would lead to acceptable performance.

simon-king-jena commented 11 years ago

comment:4

I was thinking about "versioning" of the coercion graph as well. Perhaps it would be worth trying.

The idea would be: We have one global variable, say, cdef unsigned int coerce_graph_version. Whenever we create a node that is neither a sink nor a source in the coerce graph, we increment the variable. This would be cheap, a simple test in .register_coercion() and .register_embedding().

Instead of storing None when a coercion can not be found, we store the current version. Hence, if we do mor = self._coerce_from_hash[P], a simple isinstance(mor,int) (or a faster version from the Python API that tests if the type of mor is exactly int) will tell us whether the absence of a coercion was cached. If mor==coerce_graph_version then we know that the cached absence of a coercion is reliable. Otherwise, we need to test.

Would this really be soooo expensive? I don't think so. Of course, it depends on how often we create non-sink non-source nodes---and I doubt that the combined use of .register_embedding() and .register_coercion() (which is the only way to create non-sink non-source) will happen very often.

So, I'd say we simply try.

nbruin commented 11 years ago

comment:5

Replying to @simon-king-jena:

Would this really be soooo expensive? I don't think so. Of course, it depends on how often we create non-sink non-source nodes---and I doubt that the combined use of .register_embedding() and .register_coercion() (which is the only way to create non-sink non-source) will happen very often.

I don't think you can establish whether something is a non-sink, since _coerce_from_map gives programmatic answers about that. Things are very rarely going to be non-sinks, though, since usually ZZ will coerce into them (but ZZ will never be a problem. perhaps we can leave it out of consideration)

A start might be to only version up on "embedding" creations. That might very well be usable. I expect "embeddings" to be relatively rare, and I think we can declare in our documentation they are not the preferred way of expressing relations (they are vary prone to creating non-commutative diamonds).

simon-king-jena commented 11 years ago

comment:6

Replying to @nbruin:

I don't think you can establish whether something is a non-sink, since _coerce_from_map gives programmatic answers about that.

Perhaps I have not been clear. A non-sink non-source is something that has both in-arrows (i.e., is codomain of a registered coercion or coerce embedding) and out-arrows (i.e., is domain of a registered coercion or coerce embedding). It is easy to keep track of this property while registering a coercion or coerce embedding.

But I actually think that your idea is better anyway:

A start might be to only version up on "embedding" creations.

It seems to me that this versioning would be complete under reasonable assumptions, based on the following lemma.

Lemma

Assume for all parents P, Q, P.register_coercion(Q) will be done in P.__init__(), but not later. Assume that parents R and S exist at time t_0, but there is no path from R to S in the coerce digraph at time t_0. Assume that between time t_0 and time t_1, .register_embedding() has never been called. Then, there is no path from R to S in the coerce digraph at time t_1.

Proof

Proof be contradiction. We assume that there is a path from R to S at time t_1. Hence, this path contains an arrow a that has been added to the coerce digraph after t_0. Since no register_embedding() has occurred between t_0 and t_1, all arrows created between t_0 and t_1 have been added by register_coercion(). But by hypothesis, register_coercion() is only called during __init__ of the codomain.

Hence, all arrows created between t_0 and t_1 end at parents created after t_0. Therefore, a path in the coerce digraph at time t_1 that contains a will necessarily end at a parent that has been created after t_0. This is a contradiction, since S has already existed at time t_0

q.e.d.

That might very well be usable. I expect "embeddings" to be relatively rare, and I think we can declare in our documentation they are not the preferred way of expressing relations (they are vary prone to creating non-commutative diamonds).

We could express in the documentation that one should call register_coercion only in the initialisation. Since register_embedding can only be called one time for each parent, I don't think we need to say that other methods of establishing a coercion are preferred.

simon-king-jena commented 11 years ago

Dependencies: #14711

simon-king-jena commented 11 years ago

comment:7

I hope you agree that we should start on top of #14711, because you suggest to change a couple of attribute names that are touched in #14711.

simon-king-jena commented 11 years ago

comment:8

Replying to @nbruin:

For example B._coerce_embeddings_from[A] = phi, in addition to A._coerce_embedding = phi. Since it is a MonoDict and thus has weak references to the keys, B would not prevent A from being garbage collected (but of course A would still keep B alive, because we registered the coerce embedding).

Yes, something along those lines. In fact, since _coerce_embeddings_from and _coerce_from_list would serve the same purpose from the perspective of coercion discoveries, I think the entries in _coerce_from_list should also be added to _coerce_embeddings_from, because it would simplify the code. By then it would make more sense to call the attribute _coerce_from_to_be_used_in_backtracking and the list that is now _coerce_from_hash could be _coerce_from_cached_results.

Or shorter: _coerce_from_backtracking and _coerce_from_cache.

By then we should probably be storing ALL maps there in weakened form.

Probably.

The only function of _coerce_from_list now is to keep domains alive, so it would be more economical to replace that by a list _domains_to_keep_alive where we can store strong references to the domains of the corresponding weakened maps that are now in _coerce_from_to_be_used_in_backtracking

I have introduced Parent._registered_domains for exactly this purpose, at #14711. So, we should extend its use.

P._codomains_to_keep_alive: List of codomains that should be kept alive as long as P lives. Normally, such codomains Q would have an entry in Q._coercion_maps[P]. Just storing a map in P._coerce_embedding would also have this effect.

Yes, and that's why I don't think we need P._codomains_to_keep_alive (I'd prefer the name P._registered_codomains). Even in weakened maps, we have a strong reference to the codomain.

simon-king-jena commented 11 years ago

comment:9

PS: We also have convert_from_list, and I guess we should proceed similarly for coerce_from_list.

simon-king-jena commented 11 years ago

comment:10

With the branch that I have just attached, one gets

sage: class pA(Parent): pass
sage: class pB(Parent): pass
sage: class pC(Parent): pass
sage: 
sage: A=pA(); B=pB(); C=pC()
sage: 
sage: BtoA=Hom(B,A)(lambda x: A(x))
sage: AtoC=Hom(A,C)(lambda x: C(x))
sage: A.register_coercion(BtoA)
sage: A.register_embedding(AtoC)
sage: C.coerce_map_from(B)
Composite map:
  From: <class '__main__.pB'>
  To:   <class '__main__.pC'>

        WARNING: This map has apparently been used internally
        in the coercion system. It may become defunct in the next
        garbage collection. Please use a copy.

What has been done in the patch:

renaming of attributes (_coerce_from_backtracking etc)
do not use a list, but use a MonoDict for all morphisms that are supposed to be fundamental for backtracking. There is a list _registered_domains, so that domains of registered coercions are kept alive by the codomain, but the domain of a registered embedding is not kept alive by the codomain.

Missing: Tests and versioning.

simon-king-jena commented 11 years ago

Branch: u/SimonKing/ticket/15303

nbruin commented 11 years ago

Commit: dcd8d68

nbruin commented 11 years ago

comment:12

Replying to @simon-king-jena:

PS: We also have convert_from_list, and I guess we should proceed similarly for coerce_from_list.

yes, conversion probably needs some attention too. However, since conversions exits between a lot more pairs of parents, Do we use backtracking to discover them? There are almost certainly loops: trivially, because ZZ->Z/nZ and Z/nZ->ZZ are valid conversions.

If we want a memory leak-free implementation I suspect having conversion without lifetime implications in either direction is required. Definitely material for another ticket.

Last 10 new commits:

`[changeset:dcd8d68]`	`Use registered embeddings for backtracking in the coercion model`
`[changeset:85cf7e8]`	`Merge branch 'ticket/14711' into ticket/15303`
`[changeset:364b985]`	`Add warning to string repr of weakened maps. Implement copying for all kinds of maps.`
`[changeset:5168cfd]`	`Generic copy method for maps, using _update_slots Use a cdef _codomain, since the codomain is strongly refed anyway Add doctests`
`[changeset:452d216]`	`Add docs to SchemeMorphism`
`[changeset:05fb569]`	`Change SchemeMorphism back (to cope with a Cython bug), copying the new code from sage.categories.map.Map`
`[changeset:8fd09d5]`	`Copying of PolynomialBaseringInjection and FormalCompositeMap`
`[changeset:be37145]`	`Let SchemeMorphism inherit from Morphism, not from Element`
`[changeset:0f38a2c]`	`Keep strong reference to codomain of weakened coerce maps Keep strong reference to domains of registered coercions`
`[changeset:a53261d]`	`Keep a strong reference to the codomain of PrecomposedAction`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 11 years ago

Changed commit from dcd8d68 to f837cbe

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 11 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

[changeset:f837cbe] Store a version number for the coercion cache, depending on the registered embeddings

simon-king-jena commented 11 years ago

comment:14

I have pushed a new commit that implements version numbers for the coercion cache, and I added a doctest along the following lines:

sage: class pA(Parent): pass
sage: class pB(Parent): pass
sage: class pC(Parent): pass
sage: A=pA(); B=pB(); C=pC()
sage: BtoA=Hom(B,A)(lambda x: A(x))
sage: AtoC=Hom(A,C)(lambda x: C(x))
sage: A.register_coercion(BtoA)
sage: C.coerce_map_from(B)
sage: A.register_embedding(AtoC)
sage: C.coerce_map_from(B)
Composite map:
  From: <class '__main__.pB'>
  To:   <class '__main__.pC'>

        WARNING: This map has apparently been used internally
        in the coercion system. It may become defunct in the next
        garbage collection. Please use a copy.

Hence, I think that the current commit solves our problem. In spite of the following "todo" list, I put it to "needs review" (but I am sure that further commits will be pushed soon).

TODO:

Add documentation, stating that registering of coercions should only be done during initialisation, and that registering of embeddings shouldn't be done too often.
Run all doctests---since playing with the coercion digraph is asking for trouble, I wouldn't be surprised about bad surprises :-P
Important: Get timings for examples that should be most sensitive against slow-downs in coercion.

simon-king-jena commented 11 years ago

Author: Simon King

simon-king-jena commented 11 years ago

comment:15

I already found that some test fails when s = SymmetricFunctions(QQbar).s() is done. "Of course", the error only occurs in sage -t src/sage/combinat/sf/sfa.py, but not if one just does s = SymmetricFunctions(QQbar).s() in an interactive session...

nbruin commented 11 years ago

comment:16

Important: Get timings for examples that should be most sensitive against slow-downs in coercion.

Looking at what happens for number fields, I have the impression that QQ._populate_coercion_lists_ is most frequently used to supply embeddings (I'm not getting many hits on direct calls to !register_embedding, see below)

The only two classes I have been able to identify that actually install embeddings are numberfields (and not all of them) and AbelianGroupWithValues_class

So, assuming that the version check itself is lightning-fast (and why shouldn't it be?) I expect that negatively affected examples would have to do a lot of number field or AbelianGroupWithValues creations interleaved with complicated coercion discovery that benefits a lot from knowing certain coercions do NOT exist.

There's AdditiveAbelianGroupWrapper which does an _unset_coercions_used together with a register_embedding (registering an "embedding" of the wrapper into the wrapped)

There's also UniversalCyclotomicField which registers an embedding into QQbar upon init, so that shouldn't really affect versioning.

simon-king-jena commented 11 years ago

comment:17

The following runs into an infinite loop:

sage: CF = CyclotomicField(5)
sage: UCF.<E> = UniversalCyclotomicField()
sage: CF = CyclotomicField(5,embedding=CC(exp(4*pi*i/5)))
sage: x = E(5)
sage: CC(x)

simon-king-jena commented 11 years ago

Work Issues: analyse recursion error

simon-king-jena commented 11 years ago

comment:18

Shorter:

sage: UCF.<E> = UniversalCyclotomicField()
sage: phi = copy(QQbar.coerce_map_from(UCF))
sage: x = E(5)
sage: phi(x)

Hence, we do have a map, but applying it will run into a loop.

simon-king-jena commented 11 years ago

comment:19

PS: Note that the failing map is a coerce embedding. Hence, it could be that I messed up things in a way that are not related with the cache version. Anyway, I just tested that weakening the coerce embedding has not been the culprit.

simon-king-jena commented 11 years ago

comment:20

Further boiled down:

sage: psi = QQbar.coerce_map_from(ZZ)
sage: psi(1)
<infinite loop>

simon-king-jena commented 11 years ago

comment:21

Aha!

With my current branch, the coerce map from ZZ to QQbar goes via the universal cyclotomic field, if one has created the universal field before putting the coercion into the cache.

Obvious reason: With my patch, both registered coercions and registered embeddings are used for backtracking, in order to fix the example of the ticket description. However, if there is a registered coercion, then I guess it should have preference over the registered embedding for backtracking. After all, the registered embedding is some kind of "forward", not "back".

So, modified suggestion: We should not put both registered coercions and registered embeddings into codomain._coerce_from_backtracking, but we should have two separate containers, and should look for the coerce embeddings only if no registered coercions are left.

Rationale: We suppose that registering coercions happens during initialisation of the codomain, whereas registering an embedding will happen after creation of the codomain. Since the backtracking starts at the codomain, we should have a clear preference for using those maps that are closely tied to the codomain---and these are the maps registered during creation of the codomain.

nbruin commented 11 years ago

comment:22

Replying to @simon-king-jena:

Rationale: We suppose that registering coercions happens during initialisation of the codomain, whereas registering an embedding will happen after creation of the codomain. Since the backtracking starts at the codomain, we should have a clear preference for using those maps that are closely tied to the codomain---and these are the maps registered during creation of the codomain.

I don't think that rationale is quite valid for the way it gets used. If we do

a+b

then we'll be investigating whether b can be coerced into the parent of a and whether a can be coerced into the parent of b. The original question doesn't actually specify a special role for the codomain (which is still up for choice!)

It seems to me that you want to argue (and your example backs you up in this) that while there can be multiple paths in the coercion graph, there are some paths that should be preferred over others. If we want to model this properly, we would have to include this information in the graph. One way would be to give each coercion a "cost" and we'd be looking for the lowest cost path (which could change with mutations of the graph!).

For your suggestion to fit in such a model, we would need that:

any path including an embedding is higher cost that a path without.
if a path has been discovered, new paths between the same parents that may arise by graph mutations are higher cost

It's not immediately clear to me that those assumptions are valid.

Note: what we're trying to solve on this ticket hasn't actually caused problems in real life, partly because embeddings are indeed rare. I could see the temporary outcome of this ticket being "yes, this is a potential problem, here are a few things we could do to fix it, but the cost of doing this is considerable. We'll revisit this if we run into a real-life situation that is actually affected by it".

The current approach is basically one where we ignore the cost of the last arrow, and otherwise make the cost of an embedding infinity (which gives a rather odd cost measure, but leads to somewhat desirable behaviour as we're finding out now)

simon-king-jena commented 11 years ago

comment:23

The example of QQbar.coerce_map_from(ZZ) fails as follows.

We want that the coercion is determined by QQbar._coerce_map_from_(ZZ) (which returns True). But apparently (with my current patch, at least), the backtracking has preference over what is returned by _coerce_map_from_. I don't understand the reason, yet.

simon-king-jena commented 11 years ago

comment:24

Aha! As it turns out, the "coerce costs" of the direct map are higher than the coerce costs of the composite map that goes via the universal cyclotomic field:

sage: UCF.<E> = UniversalCyclotomicField()
sage: psi = QQbar.coerce_map_from(ZZ)
sage: psi._coerce_cost
30

versus

sage: psi = QQbar.coerce_map_from(ZZ)
sage: psi._coerce_cost
100

EDIT: ... and this explains why the coercion model does not rely on the result of QQbar._coerce_map_from_(ZZ) (returning a map of cost 100), while the composite map only has the cost 30

simon-king-jena commented 11 years ago

comment:25

The coerce costs of a (generic) map with exact domain and codomain are 10, but 10000 if one of them is inexact. However, the coerce costs of a DefaultConvertMap are between 100 and 400. And it seems to me that this is not well balanced.

Namely, if the coercion model has to choose between a single default convert map and a composition of 10 generic maps, it would take the composition---which would of course normally be slower than a single map!

This decision has been made 5 years ago. Perhaps the coerce costs should be refactored?

It makes sense that the coerce costs for inexact (co)domain will be high. But a single map should certainly be better than a composite map with the same domain and codomain.

simon-king-jena commented 11 years ago

comment:26

Robert, perhaps you can comment on the reasons for choosing the coerce costs for DefaultConvertMap and friends? Why are the costs so much higher than for a generic map?

simon-king-jena commented 11 years ago

comment:27

It seems to me that the ._coerce_costs of a map are largely ignored. Namely, in .discover_coercion(), we have a parameter num_paths: If num_paths paths in the coercion graph are found by backtracking, then the path with the least ._coerce_costs is returned. But since num_paths=1, in fact the first found coercion is returned.

The only exception is the user provided morphism. Here, the rule is: Even if the user provides a coerce morphism, then one still searches a coercion by backtracking, and in case of success the coerce costs of the user-provided morphism is compared with the other morphism.

This just isn't fair!!! I believe that the user-provided morphism should at least have the same priority as a morphism found by backtracking.

Two approaches, that are not mutually exclusive:

Amend the coerce costs, so that a simple map has less costs than a composite map, unless there is a very good reason.
Let the user-provided morphism count for the number of morphisms found in .discover_coercion(). Hence, if num_paths=1 and the user provides a morphism, then the maximal number of paths-to-be-considered is attained and hence the user-provided morphism is returned without backtracking.

I think the second point is more important, and I will implement it now.

simon-king-jena commented 11 years ago

comment:28

Another issue we might eventually tackle: Connecting paths in the coerce digraph are found by depth first search. We might consider to replace it by a breadth first search. On a different ticket, perhaps, if there is interest.

simon-king-jena commented 11 years ago

comment:29

Replying to @simon-king-jena:

Let the user-provided morphism count for the number of morphisms found in .discover_coercion(). Hence, if num_paths=1 and the user provides a morphism, then the maximal number of paths-to-be-considered is attained and hence the user-provided morphism is returned without backtracking.

I think the second point is more important, and I will implement it now.

Note one remark in the code:

                # If there is something better in the list, try to return that instead
                # This is so, for example, _coerce_map_from_ can return True but still
                # take advantage of the _populate_coercion_lists_ data.

The _populate_coercion_lists_ data are also put into _coerce_from_cache. Hence, these data will have priority over _coerce_map_from_ in self.coerce_map_from(other). Therefore, self.discover_coerce_map_from(other) will only be called if there are no relevant _populate_coercion_lists_ data.

Hence, I believe that the remark is not valid, and will remove it.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 11 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

[changeset:74821fe] Give user-provided coerce maps the same priority than to a coerce map found by backtracking

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 11 years ago

Changed commit from f837cbe to 74821fe

simon-king-jena commented 11 years ago

Changed work issues from analyse recursion error to none

simon-king-jena commented 11 years ago

comment:31

The intermediate problem has been added as a doctest.

simon-king-jena commented 11 years ago

Work Issues: Analyse recursion error

simon-king-jena commented 11 years ago

comment:32

As it turns out, the last commit fixes one error, but the more problematic errors persist.

simon-king-jena commented 11 years ago

comment:33

The following sequence of commands results in a "recursion depth exceeded" (some other errors are expected, the example is taken from the doctests):

Sym = SymmetricFunctions(QQ)
Q = Sym.kBoundedQuotient(3,t=1)
Q
km = Q.km()
km
F = Q.affineSchur()
F(km(F[3,1,1])) == F[3,1,1]
km(F(km([3,2]))) == km[3,2]
F[3,2].lift()
F[2,1]*F[2,1]
F[1,2]
km[2,1]*km[2,1]
HLPk = Q.kHallLittlewoodP() 
HLPk[2,1]*HLPk[2,1]
dks = Q.dual_k_Schur()
dks[2,1]*dks[2,1]
Q = Sym.kBoundedQuotient(3)
Sym = SymmetricFunctions(QQ['t'].fraction_field())
Q = Sym.kBoundedQuotient(3)
km = Q.km()
F = Q.affineSchur()
F(km(F[3,1,1])) == F[3,1,1]
km(F(km([3,2]))) == km[3,2]
dks = Q.dual_k_Schur()
HLPk = Q.kHallLittlewoodP()
dks(HLPk(dks[3,1,1])) == dks[3,1,1]
km(dks(km([3,2]))) == km[3,2]

Trying to minimise the example now...

simon-king-jena commented 11 years ago

comment:34

PS: Some of the commands seem to have a rather sluggish behaviour. Perhaps the example is also good for detecting a regression?

simon-king-jena commented 11 years ago

comment:35

I observe that the example sometimes works and sometimes fails. This seems to indicate that one "randomness" in my code is problematic.

Namely, with the current commits, the coercions that are used by the backtracking algorithm are stored in a MonoDict (codomain._coerce_from_backtracking). The hash depends on the id of the keys and is thus susceptible to change between different runs of Sage, and therefore the order in which the arrows of the coerce digraph are considered in the backtracking algorithm also changes between different runs of Sage.

I am not sure what I shall think of it. My first impulse: It does not matter what map is returned as result of backtracking, but in any case it must be a valid map. In the example of comment:32, it seems that some arrows in the coerce graph make assumptions on other arrows of the coerce graph (therefore the recursion), which is a bug.

simon-king-jena commented 11 years ago

comment:36

I think I see a potential problem. In sage.combinat.sf.k_dual.DualkSchurFunctions.__init__, there is

kHLP = kBoundedRing.kHallLittlewoodP()
self.module_morphism(self._dks_to_khlp_on_basis,codomain=kHLP).register_as_coercion()

Hence, a coercion is registered on kHLP after it is initialised. As we have seen in the proof of my Lemma, such registration should result in an increase of the cache version number.

Hence, I will change Map.register_as_coercion() so that it increases the cache version number.

simon-king-jena commented 11 years ago

comment:37

Replying to @simon-king-jena:

Hence, I will change Map.register_as_coercion() so that it increases the cache version number.

I think it should be done, but it turns out that it does not fix the problem.

simon-king-jena commented 11 years ago

comment:38

I found that the example above has the following structure:

We have parents A and B. We search a coercion A -> B.
B stores coercions (for backtracking) from parents C and D. Only C is registered by B.register_coercion(...), whereas the coercion from D was registered in a different way, after initialisation of B.
There is a coercion phi from A to C and a coercion psi from A to D. Sage knows that phi and psi exist. However, calling psi internally relies on a coercion from A to B.
The coercion from A to B must go via C, starting with phi. Otherwise, if one tries to coerce A via D into B, calling psi (as part of a composite coerce map) will result in an infinite recursion.

Problem with the current commits

It currently depends on a random choice whether the backtracking algorithm first goes back from B to C (and finds a good coerce map from A) or from B to D (and finds a map from A that crashes).

We need to give a hint to the coercion system that it should try C first. In vanilla Sage, this is easy, because the registered coercions are in a list, and the maps registered first are chosen first. But with my current patch, all registered coercions and coerce embeddings are stored in a MonoDict. Hence, the order of registration has no implication on the order of execution.

Suggested solution

This brings me back to the idea to keep a list of registered coercions, that would automatically keep the domain of registered coercions alive, and a MonoDict of other coercions used for backtracking (e.g. registered embeddings), whose domains are not necessarily kept alive.

In discover_coercion, we would then first consider the registered coercions, and only if their study is complete would the backtracking turn towards the other coercions.

Perhaps useful

We have self.register_coercion(mor), that adds a coercion to the list of coercions that the backtracking algorithm is starting with. In addition, we might have a method self.additional_coercion(mor), that will put mor only in the MonoDict of coercions that the backtracking algorithm is considering only later.

Would this be useful? If "yes", what lifetime implications do we want? Shall the domain of an "additional coercion" be kept alive?

nbruin commented 11 years ago

comment:39

Replying to @simon-king-jena:

I found that the example above has the following structure:

We have parents A and B. We search a coercion A -> B.

B stores coercions (for backtracking) from parents C and D. Only C is registered by B.register_coercion(...), whereas the coercion from D was registered in a different way, after initialisation of B.

There is a coercion phi from A to C and a coercion psi from A to D. Sage knows that phi and psi exist. However, calling psi internally relies on a coercion from A to B.

The coercion from A to B must go via C, starting with phi. Otherwise, if one tries to coerce A via D into B, calling psi (as part of a composite coerce map) will result in an infinite recursion.

It may well be that you can hide the problem by enforcing some lookup order, but you'd be introducing data (in this case list order) that is not part of our model. Hence, you'll end up relying on essentially unspecified (well, implementation-specified) behaviour. That makes it very hard to reason about how the system should behave and hence to determine (in the future) whether something is a bug. It may well be (in the end) that enforcing a lookup order turns out to be required to keep performance, but it would probably be preferable to find a better solution.

As you observe, the coercion A->D is in actuality derived from A->B, but apparently not by a straightforward composition. However, doesn't that mean that if A->D is discovered, then A->B must be discovered already? If A->D relies on A->B, I think part of registering A->D is ensuring that A->B is discovered.

Since it seems we need a tie-breaker for multiple paths anyway, and there is a cost system (which I think was primarily introduced for pushout construction!), it is then just a matter of ensuring that the cost of A->D is higher than the cost of A->B. Then anything that can be done from A and then via B or D will prefer the direct path A->B.

So it seems to me that the "theoretically" better solution (which may be prohibitive cost-wise!) would be:

recalibrate some of the costs
take the costs properly into account
probably do so via breadth-first (for finding shortest path, it allows much better pruning)

I think that should solve the A->D problem as well.

simon-king-jena commented 11 years ago

comment:40

Replying to @nbruin:

It may well be that you can hide the problem by enforcing some lookup order, but you'd be introducing data (in this case list order) that is not part of our model.

How to formulate a model that takes into account dependencies between arrows of a digraph? In particular between arrows that may not even be adjacent?

As you observe, the coercion A->D is in actuality derived from A->B, but apparently not by a straightforward composition.

Correct. When calling the coercion A->D, some arithmetic is performed on underlying data, this involves coercion, and this is when A->B occurs.

However, doesn't that mean that if A->D is discovered, then A->B must be discovered already?

No. You can easily define a map and then compose with other maps, without calling it. A->B is only needed for calling A->D, but not for definining A->D.

If A->D relies on A->B, I think part of registering A->D is ensuring that A->B is discovered.

Yes, and how to model this?

Since it seems we need a tie-breaker for multiple paths anyway, and there is a cost system (which I think was primarily introduced for pushout construction!), it is then just a matter of ensuring that the cost of A->D is higher than the cost of A->B. Then anything that can be done from A and then via B or D will prefer the direct path A->B.

... but only if, in addition, we actually use the coerce costs in discover_coercion(), rather then returning the first map that came across.

So it seems to me that the "theoretically" better solution (which may be prohibitive cost-wise!) would be:

recalibrate some of the costs

take the costs properly into account

probably do so via breadth-first (for finding shortest path, it allows much better pruning)

I think that should solve the A->D problem as well.

Perhaps.

Breadth first search might be a good idea. I don't know how to efficiently implement it, but I guess this should be done at some point.
Calibrating the costs, such that A->C->B is guaranteed to have less cost than A->D->B for any example in which A->C is independent of but A->D is dependent on A->B, requires a case-by-case analysis. Not very practical, and moreover choosing parameters such that "it works" is kind of arbitrary. This is not a very clean solution either.

Before we have implemented breadth first search, I believe it would be more practical to say that we should

give preference to maps registered during initialisation (P.register_coercion(mor))
give less preference to maps registered after initialisation (mor.register_as_coercion())
give least preference to registered embeddings (I am talking about backtracking here, not pushout).

This is practical because this approach is close to what is done in vanilla Sage. The current registration order is such that "it works".

nbruin commented 11 years ago

comment:41

Replying to @simon-king-jena:

If A->D relies on A->B, I think part of registering A->D is ensuring that A->B is discovered.

Yes, and how to model this?

In this case it could just be done by explicitly asking the system to discover A->B before installing A->D. However, proper cost accounting shouldn't require discovery order to matter at all.

Breadth first search might be a good idea. I don't know how to efficiently implement it, but I guess this should be done at some point.

Just keep track of all discovered shortest paths (ordered by domain) into the codomain, and try to extend the shortest path. Merge the new-found paths (only store them if you found a shorter path or a path from a new domain). As soon as you found a path from the required domain, prune by cost (you only need to store paths that could possibly lead to a cheaper path). Continue until all paths you have stored have been examined (and further path is longer than what you've found). The expensive part is still determining there is NO path, because no pruning can occur then.

Calibrating the costs, such that A->C->B is guaranteed to have less cost than A->D->B for any example in which A->C is independent of but A->D is dependent on A->B, requires a case-by-case analysis. Not very practical, and moreover choosing parameters such that "it works" is kind of arbitrary. This is not a very clean solution either.

That is right, but it's not a very clean problem either. You need something to distinguish A->C->B from A->D->B. The only place where we have enough information to insert this data is when we define A->D. Since that coercion requires A->B, you could encode its cost to be cost(A->B) + . This is basically what path composition would do for you normally. Since A->D is derived from A->B, you'll have to insert that information in some other way.

This is practical because this approach is close to what is done in vanilla Sage. The current registration order is such that "it works".

If you want a practical solution, I think leaving the status quo for now is the most practical. It's working now. Remember that the issue in the ticket has not occurred in a practical example (yet). The main purpose of this ticket as I see it is to document that we are not implementing the model we think we're implementing. Changing it to another implementation that also doesn't behave as our model isn't necessarily progress. It only is if we're getting desired behaviour in a case that actually matters.

The current registration order is such that "it works".

Hm, that may be backwards. Sage currently works because it's built with the coercion discovery as it is at the moment. Patches wouldn't have made it in if they didn't work (this is certainly true for doctest examples. There can be other examples (future bug reports) out there that may fail.

We can of course try to rationalize the current model. It seems to me we may be borrowing from the fact that the time line has no cycles (basically the arrow of causality), so preferring maps that were put in "earlier" should tend to not depend on later inserted maps.

Of course, this flies in the face of the model we're claiming to implement: Properties of a digraph don't rely on the history of the digraph. They only rely on the state of the graph.

If you want a history-sensitive graph, you could put as cost of the edge the insertion time of the edge, and use the maximum over a path as the cost of the path. Do you think that's a reasonable model?

simon-king-jena commented 11 years ago

comment:42

Sidenote: I tried to make it so that the registered coercions are studied first, in backtracking. It seems that it did fix the failing example above. However, further recursion errors remain.

sagemath / sage

Coercion discovery fails to be transitive #15303