sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.23k stars 429 forks source link

Cythonize CombinatorialFreeModuleElement #22632

Closed tscrim closed 7 years ago

tscrim commented 7 years ago

We move CombintorialFreeModuleElement to its own class IndexedFreeModuleElement and cythonize it for (future) speed gains.

CC: @sagetrac-sage-combinat @nthiery @jdemeyer @fchapoton

Component: performance

Keywords: days85

Author: Travis Scrimshaw

Branch: 20cd0da

Reviewer: Nicolas M. Thiéry

Issue created by migration from https://trac.sagemath.org/ticket/22632

tscrim commented 7 years ago

Branch: public/combinat/cythonize_CFM_element-22632

tscrim commented 7 years ago

New commits:

e918674Moving CombinatorialFreeModuleElement to own (Cython) file.
tscrim commented 7 years ago

Commit: e918674

hivert commented 7 years ago
comment:3

I Travis,

I'm curious, is there any performance gain. I had the impression that storing everything in python dict simply kills all possibilities of optimization ?

We discussed during past sage days, to re-implement combinatorial free modules, using arrays with a global dict for all object used only during input and output and not during linear algebra.

Florent

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from e918674 to 6b9d88b

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6b9d88bDoing some additional tweaks.
tscrim commented 7 years ago
comment:5

There seems to be a little bit with doing basic operations:

sage: C = CombinatorialFreeModule(QQ, [1,2,3])
sage: x = sum(C.basis())
sage: %timeit [x + x for _ in range(1000)]
100 loops, best of 3: 2.57 ms per loop

vs old:

sage: C = CombinatorialFreeModule(QQ, [1,2,3])
sage: x = sum(C.basis())
sage: %timeit [x + x for _ in range(1000)]
100 loops, best of 3: 2.94 ms per loop

While it's not much, it is something like a 10-15% speedup, but it is something likely to be called a significant number of times. It's even more so for scalar multiplication:

sage: s = SymmetricFunctions(ZZ).s()
sage: x = s.an_element()
sage: %timeit [4*x for _ in range(1000)]
100 loops, best of 3: 2.49 ms per loop

vs old:

sage: %timeit [4*x for _ in range(1000)]
100 loops, best of 3: 3.42 ms per loop

I might even be able to get some more speed if I am smarter in how we create the results of, e.g., _add_.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f1fcdd9Do not go through _from_dict but directly create an element.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 6b9d88b to f1fcdd9

tscrim commented 7 years ago
comment:7

That gives a very nice speed boost:

sage: C = CombinatorialFreeModule(QQ, [1,2,3])
....: x = sum(C.basis())
....: %timeit [x + x for _ in range(1000)]
....: 
1000 loops, best of 3: 1.84 ms per loop
sage: s = SymmetricFunctions(ZZ).s()
sage: x = s.an_element()
sage: %timeit [4*x for _ in range(1000)]
100 loops, best of 3: 2.05 ms per loop

Now those Python calls become relatively expensive, so we should avoid them.

tscrim commented 7 years ago
comment:8

Frédéric, I'm cc-ing you because you might want to note that I am taking care of the __cmp__ issue here as well.

fchapoton commented 7 years ago
comment:9

many broken doctests..

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1d4aa1cFixing unpickling issues.
4ee3313Fixing Cython not inheriting `_rmul_` and `_lmul_` from different signatures.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from f1fcdd9 to 4ee3313

tscrim commented 7 years ago
comment:11

Fixed the pickling issues and the other failures (incompatible signatures for _lmul_ and _rmul_).

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 4ee3313 to d87735d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d87735dFixed trivial doctest failure.
tscrim commented 7 years ago
comment:13

Patchbot is green (essentially, doctest failure is independent).

tscrim commented 7 years ago
comment:14

ping

nthiery commented 7 years ago
comment:15
diff --git a/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst b/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst
index 3f964bf..749ab5f 100644
--- a/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst
+++ b/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst
@@ -285,8 +285,7 @@ Some particular actions modify the data structure of ``el``::
        sage: el.__class__
         <class 'sage.combinat.free_module.CombinatorialFreeModule_with_category.element_class'>
        sage: el.__dict__
-       {'__custom_name': 'foo',
-        '_monomial_coefficients': {[1, 2, 3]: 1, [1, 3, 2]: 3}}
+       {'__custom_name': 'foo'}

Plausibly the explanation around that change needs to be updated a bit: there remains no "normal" attribute in this example. Maybe el.foo=1 could be issued before, and a comment added after that _monomial_coefficients is a Cython attribute?

In general, this seems to be calling for changing the running example for this tutorial. The current one is a bit complicated. But that's for a later ticket.

nthiery commented 7 years ago
comment:16
@@ -242,7 +238,8 @@ class FreeAlgebraElement(AlgebraElement, CombinatorialFreeModuleElement):
             if self_on_left:
                 return Factorization([(self, 1)]) * scalar
             return scalar * Factorization([(self, 1)])
-        return super(FreeAlgebraElement, self)._acted_upon_(scalar, self_on_left)
+        ret = super(FreeAlgebraElement, self)._acted_upon_(scalar, self_on_left)
+        return ret

What's the point?

nthiery commented 7 years ago
comment:17

As you notice, I am going through the changes. Up to those details, it looks good :-)

nthiery commented 7 years ago
comment:18

About _divide_if_possible_: since we are moving it anyway, shall we use the occasion to put it somewhere meaningful? Maybe as a method in EuclideanDomains, as x._divide_if_possible(r), next to quo_rem which it depends on?

nthiery commented 7 years ago
comment:19

I would move the comment about handling old pickles in the doc of CombinatorialFreeModule.Element.__setstate__, with a trac ref. Please also add a brief comment and trac ref for the unpickle_override at the end of the file.

nthiery commented 7 years ago
comment:20

Could you elaborate a bit why we need CombinatorialFreeModule.Element to be a subclass of IndexedFreeModuleElement? In particular, can you provide a minimal non-working example?

nthiery commented 7 years ago
comment:21
# TODO: move the content of this class to CombinatorialFreeModule.Element and ModulesWithBasis.Element
cdef class IndexedFreeModuleElement(Element):

The TODO could be updated :-)

nthiery commented 7 years ago
comment:22

I am wondering whether the printing methods (_sorted_items_for_printing, latex, repr, asciiart, ...) should be moved to Modules.WithBasis.ElementMethods. We are moving them anyway, and I believe they would be a reasonable default implementation. That would require providing a default value for parent._print_options though, and deciding for a generic idiom to run through the items.

What do you think?

We could also keep that for a later ticket.

nthiery commented 7 years ago
comment:23

Instead of sage.modules.with_basis.indexed_free_module_element, we could use the shorter sage.modules.with_basis.element. That would be consistent with the nearby morphism module. I don't foresee other element classes in this module that would not be element classes for IndexedFreeModule.

nthiery commented 7 years ago
comment:24

Users will have code importing CombinatorialFreeModuleElement. We should support them with a deprecation alias "Please use CombinatorialFreeModule.Element or IndexedFreeModuleElement."

If we wan't to be even kinder to our users, helping them prepare for the next step, we could have sage.modules.with_basis.indexed_free_module.IndexedFreeModule (or just .free_module.?) be an alias for CombinatorialFreeModuleElement. Then the above message warning would suggest IndexedFreeModule.Element.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from d87735d to f66cb0d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

fbefa66Merge branch 'public/combinat/cythonize_CFM_element-22632' of git://trac.sagemath.org/sage into public/combinat/cythonize_CFM_element-22632
68cdf67Fixing things from testing.
99c99bfMoving _divide_if_possible to EuclideanDomains.
cafb215Remove map_coefficients call in __truediv__.
a8eeca6Remove comment and deprecate CombinatorialFreeModuleElement.
6b96942Rename indexed_free_module_element to indexed_element.
f66cb0dUpdating tutorial.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e0553beAdding comments about pickles.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from f66cb0d to e0553be

tscrim commented 7 years ago
comment:27
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from e0553be to b21fb16

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b21fb1622632: just a touch of polishing on the thematic tutorial change
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b13a87022632: use trac role instead of full url in doc
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from b21fb16 to b13a870

nthiery commented 7 years ago
comment:30

Replying to @tscrim:

  • comment:15 I added a comment. Feel free to change.

Thanks; I did a second pass on it. Good to go.

  • comment:22 -1 as that is a little too close to enforcing elements being printed as sums, which is not what we want.

I see it as merely providing a sane default which works generically but can be overridden at will. Anyway, that's admittedly out of scope for this ticket; we can decide later on.

  • comment:23 I changed the file name to indexed_element as element is too generic for my tastes, but I left the class name the same.

Sounds good.

it was never imported into the global namespace. However, it was in the (public) doc, so it is good to have it.

Yup, exactly.

At this point, I think we should reference both and let the user decide.

Ok!

Out of curiosity, I am now going to have a quick look at the inheritance thing of comment:20. Other than that, it's good to go; thanks for all the hard work!

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from b13a870 to dec9bd6

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

dec9bd622632: make CombinatorialFreeModule.Element an alias (and not subclass) of IndexedFreeModuleElement
nthiery commented 7 years ago
comment:32

Advantage of the above commit:

Inconvenient:

I believe it's worth it. What do you think?

nthiery commented 7 years ago
comment:33

Running tests here ...

tscrim commented 7 years ago
comment:34

I am not so convinced as it hides a fault in a non-trivial way; it something that we should not need to do. I do agree it is more simple. Nevertheless, one should typically use a subclass of the CFM.Element when subclassing CFM and needing a new element class if they want to use the same data structures. Although I don't think there is too much difference between our two hacks.

Also, in some ways it almost looks like you are moving the unpickling to a completely unrelated class since

register_unpickle_override("sage.combinat.free_module",
                           "CombinatorialFreeModuleElement",
                           CombinatorialFreeModule.Element)

If you want the __setstate__ in IndexedFreeModuleElement, then this register_unpickle_override should redirect to that.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

666582222632: revert b13a870be: the trac link is in an error message!
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from dec9bd6 to 6665822

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 6665822 to 4a77584

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

076d7cb22632: revert b13a870be: the trac link is in an error message!
4a7758422632: let register_pickle_override refer to IndexedFreeModuleElement and move it close by
nthiery commented 7 years ago
comment:37

Replying to @tscrim:

I am not so convinced as it hides a fault in a non-trivial way; it something that we should not need to do. I do agree it is more simple. Nevertheless, one should typically use a subclass of the CFM.Element when subclassing CFM and needing a new element class if they want to use the same data structures. Although I don't think there is too much difference between our two hacks.

Ok. I agree, it's minor anyway.

Also, in some ways it almost looks like you are moving the unpickling to a completely unrelated class since

register_unpickle_override("sage.combinat.free_module",
                           "CombinatorialFreeModuleElement",
                           CombinatorialFreeModule.Element)

If you want the __setstate__ in IndexedFreeModuleElement, then this register_unpickle_override should redirect to that.

Good point. Fixed. This prompted me to move the register_unpickle_overide next to IndexedFreeModuleElement.

Sorry for the forced-push; I had screwed up my commit just before.

Altogether, I believe this is good to go!

nthiery commented 7 years ago

Reviewer: Nicolas M. Thiéry