Open mkoeppe opened 4 years ago
Help with implementing this would be very welcome!
That's a nice idea, +1!
Data associated with symmetries, computed currently each time for each
CompWithSym
object in methods such as__init__
,__add__
,trace
, ... can then be precomputed and cached in the parent (for example using@
cached_method in the parent class).
I think it would be a good idea to cache results from trace
, contract
, ... as well.
The most annoying thing with this ticket will most likely be the docstring that has to be adapted...is there a good way to use search&replace?
Currently, the index generators and manipulators take (mostly) lists and can therefore not be cached. To use the caching most efficiently, I suggest we switch entirely to tuples.
Branch: public/30307
Commit: 943f9ea
Before I go on with this ticket, could you please take a look whether this meets your rough idea?
Is anyone willing to work on the docstrings...? Perhaps there's an efficient way to do this?
Branch pushed to git repo; I updated commit sha1. New commits:
a2514b6 | Trac #30307: no symmetries -> CompParent without symmetries |
Replying to @mjungmath:
Before I go on with this ticket, could you please take a look whether this meets your rough idea?
Yes, this is going in the direction that I had in mind.
Some more of the normalization happening in CompParentWithSym.__init__
should probably be moved to __classcall_private__
What would be a proper category for the parent?
Replying to @mkoeppe:
Some more of the normalization happening in
CompParentWithSym.__init__
should probably be moved to__classcall_private__
For example?
For example the order of the component indices does not matter
Replying to @mjungmath:
What would be a proper category for the parent?
Take the following with a grain of salt (this is just a rough/naive thought): It seems to me that the current proposal is a kind of abuse of Sage's parent/element scheme, for CompParent
does not correspond to a "genuine" mathematical object (hence maybe your question...). In particular, it does not know about the ring on which the components are based. I do not deny that a reorganization of Components
would be a good thing, but maybe at the class level, not at the parent/element level, the issue here being mostly the storage of metadata.
Just use the category of sets. I don't think this is an abuse.
Think of an instance of CompParent
as the set of all possible Components
instances that have the specified symmetry.
By using the parent/element framework, you will get coercion for free - so elements with a coarser symmetry will be in
a finer parent.
Replying to @mkoeppe:
For example the order of the component indices does not matter
Right!
Replying to @mkoeppe:
Just use the category of sets. I don't think this is an abuse.
Think of an instance of
CompParent
as the set of all possibleComponents
instances that have the specified symmetry.By using the parent/element framework, you will get coercion for free - so elements with a coarser symmetry will be
in
a finer parent.
For me, it doesn't feel like an abuse either. Besides, I am sure you can make that notion of compontents mathematically rigorous. But I am not sure whether this becomes a well-defined set then... What about the category of objects?
Alright, I make progress here. I changed only the backend such that most doctests should still pass, and the module can be used exactly as before. The elementary examples already passed.
I'll push my branch tomorrow.
I am looking forward to some benchmarks as soon as this branch is ready. :)
I can understand why this might seem like an abuse because it is a set of objects, but by extension, all of the combinatorial objects would be an abuse as well. So even though we will not be putting an extra mathematical structure on the components, this is still a valid use of the framework because there is a set of objects (the parent) and the individual objects (the elements).
Something to consider, however, is a potential speed penalty for the extra levels of initialization. This can be somewhat mitigated by using Cython, but performance regression testing is warranted here.
Replying to @tscrim:
This can be somewhat mitigated by using Cython, but performance regression testing is warranted here.
How so?
Replying to @mjungmath:
Replying to @tscrim:
This can be somewhat mitigated by using Cython, but performance regression testing is warranted here.
How so?
Because Cython is faster than Python, including potential benefits from direct C-level function calls.
Yes, that's clear. I meant, how to implement? Make CompParent
and Components
both simply extension types?
Would cdpef
ing the symmetry related functions cause a speedup btw?
Since this involves only integers, i.e. struct types, one could even try to Cythonize these completely.
Branch pushed to git repo; I updated commit sha1. New commits:
fc22d5d | Trac 30307: Cythonize, restructuring, factory method to resemble old behavior |
Replying to @tscrim:
I can understand why this might seem like an abuse because it is a set of objects, but by extension, all of the combinatorial objects would be an abuse as well. So even though we will not be putting an extra mathematical structure on the components, this is still a valid use of the framework because there is a set of objects (the parent) and the individual objects (the elements).
Thank you Matthias, Michael and Travis for your replies. I'm convinced now that parent/element is a good approach here (I was bothered by the lack of algebraic structure of the parent), especially for symmetry-based coercions.
Something to consider, however, is a potential speed penalty for the extra levels of initialization. This can be somewhat mitigated by using Cython, but performance regression testing is warranted here.
Certainly!
New commits:
fc22d5d | Trac 30307: Cythonize, restructuring, factory method to resemble old behavior |
Branch pushed to git repo; I updated commit sha1. New commits:
27bc9b1 | Trac #30307: pdx file + __init__.py removed |
A possibly relevant ticket, about enhancing symmetries of components beyond the currently implemented ones: #28813
New commits:
27bc9b1 | Trac #30307: pdx file + __init__.py removed |
I am sorry for the mess. However, this is my rough approach. I separated elements and parents, and established factory methods to recover the old behavior (backwards compatibility and less work on the docstrings).
I am open to suggestions how to separate the symmetry code apart. Some symmetry treatments are coupled to the element for optimization (e.g. antisymmetrize
to determine zero more quickly).
This code is too big for me to do it alone. I'd appreciate some help here. I hope my first approach is of some use for you.
A concern about cythonizing the whole thing: the Python debugger cannot be used in Cython parts of codes, which is a pity, IMHO. So if Cython does not bring any significant performance improvement, we are loosing more than we gain.
Also shouldn't one perform a single task in this ticket, namely refactor Components
into parent/element, and leave cythonization to another ticket?
Replying to @egourgoulhon:
Also shouldn't one perform a single task in this ticket, namely refactor
Components
into parent/element, and leave cythonization to another ticket?
That makes sense. In the next step one could try to implement #28813, and perhaps even use Cython to do so.
Besides, I just learned that one could increase the init speed with Python 3 using __slots__
. Maybe that's what we should do for now.
Still, I don't know how we should separate the symmetry code properly. My idea so far was to let the parent do the symmetrizing work and return the parent with the corresponding symmetries, and construct an element from it, then assign the components. But this doesn't work so well when the element itself such as in antisymmetrize
is taken into account.
Alternatively, it would be nice do define maps (morphisms) between parents doing that work and cache those morphisms. But that is something I don't know how to do in a nice and proper way. For example, if one antisymmetrizes in the variables where the components are already symmetric, it's simply the "zero morphism".
Nevertheless, I think that the rough refactoring into comp_element
and comp_parent
(and comp
to recover the old behavior) is already a good way to go.
Replying to @egourgoulhon:
A possibly relevant ticket, about enhancing symmetries of components beyond the currently implemented ones: #28813
I agree that it would be a good idea to keep possible generalizations in mind while doing this refactoring. (Also #30276)
Definitely! So, do we have a roadmap?
This is looking nice already, and I agree that this ticket should do a Python implementation only. I would hope that there are already performance gains by caching.
@mkoeppe: What do you say about the morphism idea in comment:27?
@tscrim: What about using __slots__
in this ticket instead of Cythonizing?
Regrading the morphisms: These are exactly the reduce
, retract
, lift
maps that we discussed in #30229 - just on the level of components rather than modules.
Replying to @mkoeppe:
Regrading the morphisms: These are exactly the
reduce
,retract
,lift
maps that we discussed in #30229 - just on the level of components rather than modules.
Mh...but I suggest more than 3 morphisms. I don't know what you mean, sorry.
Can I take it as a "yes, what a wonderful idea!"? :P
Yes, there are many maps:
CompParent
with a coarser symmetry, there is an injection (lift
map) to any CompParent
with a finer symmetry.
etc.
(And yes, it's a wonderful idea.)I don't expect __slots__
to provide much benefit to speed as from what I am reading, it is more useful for memory usage. The main thing is to turn the file into a .pyx
file with a .pxd
header for declarations. The element class should be a cpdef
with the main parameters declared and given explicit typing. These are relatively easy things to do provided you don't have multiple inheritance in the element classes (the parent can remain normal Python classes in a pyx file).
IMO, you also don't loose too much with converting to Cython with debugging because you still get a lot with error tracebacks. The biggest thing I have found lost is the ease of profiling to find bottlenecks. However, testing with practical examples can get around that a bit, and there is a tool to profile Cython code IIRC. So I generally see this as a smaller trade-off.
Well, __slots__
makes initializations faster because attribute access is significantly faster (up to 30%). Memory is just an additional benefit.
Even if we turn Components
into a cpdef
, I think we still gain a good performance boost with __slots__
which can sum up quickly with recurrent use of _comp
.
To use morphisms, we need a new category (say Category of collections of abstract components
) and a homset, right?
Btw, I am not convinced that cpdef
ing the class makes initializations faster, I'd bet on the contrary. When I understand it correctly, cpdef
creates two classes in the background, an extension type and a usual Python class. Since we don't expect a benefit during the lifetime of an instance, it should even slowdown things. But maybe I just got things wrong here.
What are you talking about cpdef
ing a class? You only do that to functions AFAIK. I am still not 100% convinced by the __slots__
approach, more so because it makes it (slightly) harder to switch from a Python class to an extension class IIRC.
I don't see why we need a new category. We don't need to be so heavy-handed with the approach. Having the homset be in the category of (enumerated?) sets is sufficient IMO.
The main benefit with Cython and extension classes is being able to strongly type things to have as many C level function calls and code as possible.
Replying to @tscrim:
I don't see why we need a new category. We don't need to be so heavy-handed with the approach. Having the homset be in the category of (enumerated?) sets is sufficient IMO.
Mathematically, I am still not convinced that our parents constitute sets. Their elements run over all possible rings (and "frames"), which is not a set either.
Replying to @tscrim:
The main benefit with Cython and extension classes is being able to strongly type things to have as many C level function calls and code as possible.
Indeed. That needs a thorough modification of the current code. The indices checks use mostly lists and Python sets. The components are stored as dictionaries which cannot be strongly typed. I agree, that should eventually be done, however I propose that's something for another ticket.
For now, we can use __slots__
for a slight speedup and remove it again when we Cythonize. That's what I meant.
I don't really like this partial measure. I wouldn't do it and just keep it pure Python until you actually decide to make it Cython.
Why is the ring associated with the element and not the parent? In general, why is all of these attributes copied from the parent? Is the extra level of indirection that slow? You can just call self._parent
in the Cython code.
Because of how your current implementation is done, you are not going to benefit from coercions. Nor do you seem to be using anything in a category. Thus, I would actually weaken things and not use Sage's Parent/Element classes. Instead, I would just mimic them. Perhaps I am misunderstanding how you plan to apply these.
There is still a lot of you things you can do to tell Cython to make things be strongly typed even if they are extracted from lists/sets/dicts, such as type-casting.
Please don't look at the details yet. The code is far away from being finished. Coercions come, attributes will be removed (it's just to make parts of the code already debuggable).
Why has the parent no ring? Because the symmetries and indices simply do not depend on it. On the level of indices and symmetries we have: different ring -> same data -> same instance -> more effective storage.
Currently every
Components
object has lots of metadata attributes in addition to the actual data dictionary in._comp
.If one has many different
Components
objects with the same symmetry metadata, we can reduce storage space as follows.We create new classes
CompParent
,CompParentWithSym
, ..., which store the symmetry attributes and becomeUniqueRepresentation
. We makeComponents
objects elements of these parents.The parents will also have index iterator methods.
Data associated with symmetries and dimensions, computed currently each time for each
CompWithSym
object in methods such as__init__
,__add__
,trace
, ... can be precomputed and cached in the parent using@cached_method
in the parent class).This will make the code in #30229 (subspaces of tensor with symmetries) more elegant because it no longer needs a dummy
Components
object to represent the symmetry but rather aCompParent
object.See also:
32029 Action of a sympy
TensorSymmetry
Depends on #31904 Depends on #29619
CC: @mjungmath @egourgoulhon @tscrim @honglizhaobob
Component: linear algebra
Author: Michael Jung, Matthias Koeppe
Branch/Commit: public/30307 @
e5a81f2
Issue created by migration from https://trac.sagemath.org/ticket/30307