sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.34k stars 454 forks source link

Speed improvements for categories with axioms #16296

Closed simon-king-jena closed 10 years ago

simon-king-jena commented 10 years ago

With the "axiomatic" approach towards categories introduced in #10963, we are now using join categories far more frequently than in the past. This involves many little operations (sorting, computation of index, ...) that should be made as fast as possible.

Hence, this ticket is about using Cython and optimizing basic algorithms in sage.categories.category and friends.

Depends on #10963 Depends on #15801 Depends on #16309

CC: @nthiery @mezzarobba

Component: categories

Keywords: cython performance categories

Author: Simon King

Branch/Commit: 5bad2ff

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/16296

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:2

(curious)

simon-king-jena commented 10 years ago
comment:3

First observation: Moving sage.categories.category to Cython is not trivial, since it relies on abstract_method, which currently only accepts Python functions. Hence, abstract_method needs to be cythoned, too, or at least needs to be more permissive in the assertion that the input is types.FunctionType.

simon-king-jena commented 10 years ago
comment:4

Good: If one removes the assertion from abstract_method, then Sage starts.

simon-king-jena commented 10 years ago

Branch: u/SimonKing/ticket/16296

simon-king-jena commented 10 years ago
comment:7

I have pushed an initial branch (starting with #10963, of course), changing the assertion so that both types.FunctionType and cython_function_or_method are permitted (no idea how to test the latter more elegantly than by the name of the type...), and I changing AbstractMethod.__repr__ so that it gives comparable output for both Python and Cython methods.

Note: I don't think we should Cython AbstractMethod. After all, they are just placeholders for "real" methods and are thus not time-critical.

With the branch, Sage starts.


Last 10 new commits:

f262faatrac #16269: Merged with 6.2.rc1
338cc5btrac #16269: Reviewer's remarks
8ac32c2#16280: Fix call for FiniteEnumeratedSet's of plain Python objects
4e27454#16280: Trivial doctest fixes
f67d82etrac #16269: Merged with #16280
0a54b68Trac 16269: little improvements + line folding in the doctest output
b5ad803Trac 16269 (or follow up): optimize CartesianProduct._cartesian_product_of_elements
d2bc8b2Merge cartesian product functionalities from '#16269 and #16288' into categories/axioms-10963
e9c3586Some fixes to the docs and to comments in the code (review patch)
2cc9090Start with transitioning basic category code to Cython
simon-king-jena commented 10 years ago

Commit: 2cc9090

simon-king-jena commented 10 years ago
comment:8

Since #10963 is supposed to be merged together with #15801, which changes sage.categories.category, we should have #15801 as dependency. Pushing the next merge now...

simon-king-jena commented 10 years ago

Changed dependencies from #10963 to #10963, #15801

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

1ff993b#16275: trivial commit to use format
0074f02#16275: added doctest illustrating an unpickling where Hom is called on an uninitialized parent
c1e91f3#16275: update ticket number in the comments
ae3ca80This is handled as well by the generic of Homset.__reduce__ method,
507ee00Trac 16275: fixes for the previous commit; btw its title should have read: removed sage.modular.abvar.homspace.Homspace.__reduce__
6f55aa2Merge branch 't/16275/hom__introduce_a_check_argument_to_simplify_the_unpickling_detection_logic' into categories/over_a_base_category-15801
d6dc1fdMerge branch 'categories/axioms-10963' into categories/over_a_base_category-15801
54c3d67#15801: Initialize the base ring for module homsets
aa01591#15801: doctests for CategoryObjects.base_ring + docfix in Modules.SubcategoryMethods.base_ring
d42dc9cMerge branch 'public/categories/over-a-base-ring-category-15801' of git://trac.sagemath.org/sage into ticket/16296
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 2cc9090 to d42dc9c

simon-king-jena commented 10 years ago
comment:10

All tests in srs/sage/categories/ pass with the branch. Now, one can start to try and improve some operations...

simon-king-jena commented 10 years ago
comment:12

Some functions in c3_controlled can not be cdef'd, because they contain "any(...)" clauses. For example:

        for i in range(max_i): #from 0 <= i < max_i:
            O = heads[i]
            # Does O appear in none of the tails?
            O_key = key(O)
            if any(O_key in tailsets[j] for j in range(nbheads) if j != i):
                continue

It might be worth-while to see if the "any(...)" clause can be improved by something that is faster and does not use a clause. To the very least, I guess giving Cython some hints on the type (tailsets[j] is a set!) might pay off.

simon-king-jena commented 10 years ago
comment:13

Indeed:

def test1(list L):
    cdef int i,j
    cdef int max_i = len(L)
    for i in range(max_i):
        if any(100 in L[j] for j in range(max_i) if j!=i):
            continue

def test2(list L):
    cdef int i,j
    cdef int max_i = len(L)
    for i in range(max_i):
        if any(100 in <set>(L[j]) for j in range(max_i) if j!=i):
            continue

cpdef test3(list L):
    cdef int i,j
    cdef int max_i = len(L)
    cdef bint cont
    for i in range(max_i):
        cont = False
        for j from 0<=j<i:
            if 100 in <set>(L[j]):
                cont = True
                break
        if cont:
            continue
        for j from i<j<max_i:
            if 100 in <set>(L[j]):
                break

yields

sage: L = [set([randint(0,10000) for i in range(200)]) for j in range(200)]
sage: %timeit test1(L)
100 loops, best of 3: 2.5 ms per loop
sage: %timeit test2(L)
100 loops, best of 3: 2.5 ms per loop
sage: %timeit test3(L)
1000 loops, best of 3: 1.25 ms per loop

So, let's do it like this. I hope I am not mistaken that what is happening inside test3() is equivalent to the "any" clause in the other two functions.

simon-king-jena commented 10 years ago
comment:14

PS: 100 in <set>(L[j]) seems to have no advantage over 100 in L[j].

simon-king-jena commented 10 years ago
comment:15

Another closure occurs in the following definition:

cdef list tailsets = [set(key(O) for O in tail) for tail in tails]

Let's see how to make this a little faster without a closure:

def set_test1(list tails):
    cdef list tail
    return [set(O**2 for O in tail) for tail in tails]

cpdef list set_test2(list tails):
    cdef list tail
    cdef set part_set = set()
    cdef list result = []
    for tail in tails:
        part_set = set()
        for O in tail:
            part_set.add(O**2)
        result.append(part_set)
    return result

yields

sage: L = [[randint(0,10000) for i in range(200)] for j in range(200)]
sage: set_test1(L) == set_test2(L)
True
sage: %timeit set_test1(L)
100 loops, best of 3: 8.52 ms per loop
sage: %timeit set_test2(L)
100 loops, best of 3: 6.51 ms per loop

This time, we do not gain so much, but it is something, and we can make the function cpdef then, reducing the calling overhead after (c)importing it into another module.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from d42dc9c to 9a420ad

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

536dbcaSome cdef in _sort, _sort_uniq and join
7e07dedTrac 10963: Fixed crosslinks
250c8faMerge branch 'public/ticket/10963-doc-distributive' of git://trac.sagemath.org/sage into 10963
16b8e0fdocumentation edits to clarify that super_categories should not contain anything twice
bbfcbb8Merge Darij's commit in 'public/ticket/10963-doc-distributive' into categories/axioms-10963
f22508bTrac 10963: proofread Darij's edits
8da7522Trac 10963: degraded a bit a newly added cross-link to work around 'make doc-clean; make clean' failure
e640c02Merge the latest commits of branch 'ticket/10963' into ticket/16296
9a420adRemove closures in C3_sorted_merge by something faster, and make it cpdef
simon-king-jena commented 10 years ago
comment:17

With #15801, I get

sage: L = [Coalgebras(QQ), Sets().Finite(), Algebras(ZZ), SimplicialComplexes()]
sage: %timeit Category._sort_uniq(L)
10000 loops, best of 3: 25.2 µs per loop
sage: %timeit Category._sort(L)
100000 loops, best of 3: 6.22 µs per loop

With the current branch from here, I get

sage: L = [Coalgebras(QQ), Sets().Finite(), Algebras(ZZ), SimplicialComplexes()]
sage: %timeit Category._sort_uniq(L)
100000 loops, best of 3: 14.1 µs per loop
sage: %timeit Category._sort(L)
100000 loops, best of 3: 5.61 µs per loop

Making sage.categories.category Cython means that we could now cimport WeakValueDictionary and get a bit more speed by accessing the contents by cdef methods.

simon-king-jena commented 10 years ago
comment:18

Too bad: With the current branch,

sage: from sage.misc.sageinspect import sage_getargspec
sage: sage_getargspec(Category.join)
  File "<string>", line unknown
SyntaxError: Syntactical group starting with '(' did not end with ')'

Hence, the doc doesn't build. It seems that I need to create a dependency for this, improving sage_getargspec.

simon-king-jena commented 10 years ago

Changed dependencies from #10963, #15801 to #10963, #15801, #16309

simon-king-jena commented 10 years ago
comment:19

I created #16309 for the argspec issue.

simon-king-jena commented 10 years ago
comment:20

With #16309, the documentation builds, and

sage: from sage.misc.sageinspect import sage_getargspec
sage: sage_getargspec(Category.join)
ArgSpec(args=['categories', 'as_list', 'ignore_axioms', 'axioms'], varargs=None, keywords=None, defaults=(False, (), ()))

I am running tests now. However, even if they pass, I'd like to try and get some more speed for the WeakValueDictionary used in sage.categories.category, and sage.categories.category_with_axiom probably has more potential for improvements.

simon-king-jena commented 10 years ago
comment:21

Interestingly, with #16309 alone, the tests of sageinspect.py pass. But adding the commits from here, one test in sageinspect.py fails:

        sage: C = Rings()
        sage: HC = C.hom_category()
        sage: sage_getsourcelines(HC)
        (['    class HomCategory(HomCategory):\n', ...], ...)

is expected, but the indentation (that is due to the nesting) is wrong.

Actually it is even more wrong than that: The relevant class is defined as a nested class in sage.categories.rings, but we get

sage: from sage.misc.sageinspect import _sage_getdoc_unformatted
sage: _sage_getdoc_unformatted(HC)
"File: sage/categories/category.pyx (starting at line 2361)\n\n        Initializes this HomCategory\n\n        INPUT:\n         - ``category`` -- the category whose Homsets are the objects of this category.\n         - ``name`` -- An optional name for this category.\n\n        EXAMPLES:\n\n        We need to skip one test, since the hierarchy of hom categories isn't\n        consistent yet::\n\n            sage: C = sage.categories.category.HomCategory(Rings()); C\n            Category of hom sets in Category of rings\n            sage: TestSuite(C).run(skip=['_test_category_graph'])\n        "

That's all really strange. The code of the nested class is

    class HomCategory(HomCategory):
        pass

and indeed we have

sage: HC.__doc__

Why is _sage_getdoc_unformatted returning something nonempty, even though the class has empty __doc__?

simon-king-jena commented 10 years ago
comment:22

Now I get what is going wrong.

Since Rings.HomCategory has no docstring, _sage_getdoc_unformatted returns the docstring of Rings.HomCategory.__init__. This used to by a Python method, hence had no embedded information on the source code. But with this branch, __init__ is a Cython information and thus has embedding information, that unfortunately lets sage_getsourcelines look up the wrong file.

But in the first place: Why do we need to look up _sage_getdoc_unformatted when calling sage_getsourcelines? After all, Rings.HomCategory is a Python class in a Python file! Shouldn't it provide all relevant information?

simon-king-jena commented 10 years ago
comment:23

Probably the problem lies in the nesting:

sage: inspect.getsourcelines(HC.__class__)
---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-12-91ebc0139037> in <module>()
----> 1 inspect.getsourcelines(HC.__class__)

/home/king/Sage/git/sage/local/lib/python/inspect.pyc in getsourcelines(object)
    688     original source file the first line of code was found.  An IOError is
    689     raised if the source code cannot be retrieved."""
--> 690     lines, lnum = findsource(object)
    691 
    692     if ismodule(object): return lines, 0

/home/king/Sage/git/sage/local/lib/python2.7/site-packages/IPython/core/ultratb.pyc in findsource(object)
    189             return lines, candidates[0][1]
    190         else:
--> 191             raise IOError('could not find class definition')
    192 
    193     if ismethod(object):

IOError: could not find class definition
simon-king-jena commented 10 years ago
comment:24

With the newest version of #16309, the problem from comment:21 is fixed. So, back to (first) fixing doctests (if there is anything broken) and (second) improve speed further.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

55b73e1Fix sage_getargspec on static methods of Python classes in Cython code
32a374fFix parsing of Cython function definition in sageinspect
a0b8406Merge branch 'u/SimonKing/ticket/16309' of git://trac.sagemath.org/sage into ticket/16296
7877377Get source code for nested classes defined in Cython files
73afb99Merge branch 'ticket/16309' into ticket/16296
2ca0787Fix one test (new import location)
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 9a420ad to 2ca0787

simon-king-jena commented 10 years ago
comment:26

With the new commits from #16309 and one change in the import location of a doctest, all tests pass. But this is now not more than an intermediate stage, as I want to tweak sage.categories.category_with_axiom and want to improve weak cache access.

simon-king-jena commented 10 years ago
comment:27

Just for the record: The following is faster than uncamelcase.

cpdef inline default_match_splitter(match):
    cdef str g = match.group()
    return g[0]+'_'+g[1]

camel_splitter = re.compile("[a-z][A-Z]")

def new_uncamelcase(s, separator=None):
    if separator is None:
        return camel_splitter.sub(default_match_splitter, s).lower()
    return camel_splitter.sub(lambda match: match.group()[0]+separator+match.group()[1], s).lower()

However, since uncamelcase is rarely called, I will leave it.

simon-king-jena commented 10 years ago
comment:28

Or perhaps I should not leave it. I just notice that uncamelcase is also called (once) in Category._sort, and this is critical for speed.

simon-king-jena commented 10 years ago
comment:29

Replying to @simon-king-jena:

Or perhaps I should not leave it. I just notice that uncamelcase is also called (once) in Category._sort, and this is critical for speed.

Forget this comment. Apparently this is only because I tested an example involving "Finite".

nthiery commented 10 years ago
comment:30

Hi Simon!

Thanks so much for exploring how much we could potentially gain by cythonizing some of the category infrastructure code; we need this kind of facts to take good decisions. At this point definitely +1 on optimizing lower level things like c3, since the interface and algorithms are quite final.

On the other hand, I am wondering whether the current not so massive speed improvements make a compelling case for actually cythonizing the file category.py right now. The point is that cythonizing makes the code much harder to debug (I agree, this is a problem of pdb: we really would need to have a debugger supporting both python and cython code; alas I don't see progress in this direction anytime soon).

E.g. I find the coercion code very painful to work with precisely because of this. In fact, when I tried to fix things a couple years ago, I first uncythonized it ...

I fear that it would similarly slow down the followup development we want to do with categories, e.g. improving the join algorithm.

What do you think?

Cheers, Nicolas

simon-king-jena commented 10 years ago
comment:31

Replying to @nthiery:

On the other hand, I am wondering whether the current not so massive speed improvements make a compelling case for actually cythonizing the file category.py right now. The point is that cythonizing makes the code much harder to debug

Even if it is a Python class defined in a Cython file? I thought that problems will only start if things are cdef.

I fear that it would similarly slow down the followup development we want to do with categories, e.g. improving the join algorithm.

What do you think?

Actually improving the join algorithm speed-wise is my main goal for this ticket. I am not sure if much progress can be achieved without Cython.

I'd say: I try to improve the speed as much as possible by as much Cython as needed. In a second step, we can see if some of it can also be done in Python, or if some parts can be moved to a Cython file separate from sage.categories.category.

nthiery commented 10 years ago
comment:32

Replying to @simon-king-jena:

Even if it is a Python class defined in a Cython file?

That's kind of a shame, but yes, even in this case. Put this in test_debug_cython.pyx:

import pdb
def f(x):
    for i in range(10):
        pdb.set_trace()
        x = x + 1

and run

sage: load("test_debug_cython.pyx")
Compiling ./test_debug_cython.pyx...
sage: f(3)

The function f does not even appear in the backtrace.

Actually improving the join algorithm speed-wise is my main goal for this ticket. I am not sure if much progress can be achieved without Cython.

The upcoming work is not only about speed, but also functionality. Or trying alternative algorithms, like your pet boolean polynomial ideals approach. And even speedwise, there may be room to improve the algorithm itself, rather than the implementation.

I'd say: I try to improve the speed as much as possible by as much Cython as needed. In a second step, we can see if some of it can also be done in Python, or

Definitely worth trying hard and see how much we can gain. And see if it's worth making the code more complex / harder to trace.

if some parts can be moved to a Cython file separate from sage.categories.category.

This could be a good approach indeed.

Cheers, Nicolas

simon-king-jena commented 10 years ago
comment:33

As I have indicated on #10963, one could replace the tuple all_axioms by a container derived from dict. It is used in canonicalise_axioms, which can be accelerated by 35%.

simon-king-jena commented 10 years ago
comment:34

43% actually...

nthiery commented 10 years ago
comment:35

Replying to @simon-king-jena:

As I have indicated on #10963, one could replace the tuple all_axioms by a container derived from dict. It is used in canonicalise_axioms, which can be accelerated by 35%.

+1!

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 2ca0787 to 007cccd

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

007cccdImprove canonicalize_axioms by using a faster container for axioms
simon-king-jena commented 10 years ago
comment:37

With #15801:

sage: from sage.categories.category_with_axiom import canonicalize_axioms
sage: L = ["Commutative", "Connected", "Commutative", "WithBasis", "Finite"]
sage: %timeit canonicalize_axioms(L)
100000 loops, best of 3: 8.23 µs per loop
sage: L = [Coalgebras(QQ), Sets().Finite(), Algebras(ZZ), SimplicialComplexes()]
sage: %timeit Category._sort(L)
100000 loops, best of 3: 6.12 µs per loop
sage: %timeit Category._sort_uniq(L)
10000 loops, best of 3: 24.7 µs per loop
sage: %timeit Category.join(L)
10000 loops, best of 3: 854 µs per loop

With the branch from here:

sage: from sage.categories.category_with_axiom import canonicalize_axioms
sage: L = ["Commutative", "Connected", "Commutative", "WithBasis", "Finite"]
sage: %timeit canonicalize_axioms(L)
100000 loops, best of 3: 3.77 µs per loop
sage: L = [Coalgebras(QQ), Sets().Finite(), Algebras(ZZ), SimplicialComplexes()]
sage: %timeit Category._sort(L)
100000 loops, best of 3: 5.49 µs per loop
sage: %timeit Category._sort_uniq(L)
100000 loops, best of 3: 13.8 µs per loop
sage: %timeit Category.join(L)
10000 loops, best of 3: 659 µs per loop

So, there is an improvement of >50% in canonicalize_axioms, 44% in _sort_uniq (but not in _sort), and 22% in join.

nthiery commented 10 years ago
comment:38

Nice. How much would you say is due to the "cythonization" of all_axioms and friends, and how much to the cythonization of the full file?

simon-king-jena commented 10 years ago
comment:39

According to grep, Category._sort() is not really used. So, would you mind if I remove _sort(), and only replace _sort_uniq() by a Cython function?

Best regards, Simon

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d7f8250Stick with Python for category/category_with_axiom, use cythoned helper functions
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 007cccd to d7f8250

simon-king-jena commented 10 years ago
comment:41

I switched sage.categories.category and sage.categories.category_with_axiom back to Python. Instead, I created a module with helper functions. Fortunately, _sort_uniq is a static method, and thus it is no problem to replace it with a cpdef function imported from the helper module.

The current timings:

sage: from sage.categories.category_with_axiom import canonicalize_axioms, all_axioms
sage: canonicalize_axioms(all_axioms, ["Commutative", "Connected", "Commutative", "WithBasis", "Finite"])
('Finite', 'Connected', 'WithBasis', 'Commutative')
sage: %timeit canonicalize_axioms(all_axioms, L)
100000 loops, best of 3: 3.83 µs per loop
sage: L = [Coalgebras(QQ), Sets().Finite(), Algebras(ZZ), SimplicialComplexes()]
sage: %timeit Category._sort(L)
100000 loops, best of 3: 6.25 µs per loop
sage: %timeit Category._sort_uniq(L)
100000 loops, best of 3: 15.9 µs per loop
sage: %timeit Category.join(L)
10000 loops, best of 3: 816 µs per loop

Note: Since I want to keep all_axioms in sage.categories.category_with_axiom, I need to provide it to canonicalize_axioms (now defined in the helper module) as an argument.

Evaluation

Concerning join

Perhaps one could create another helper function for the join. It is static, hence, a cpdef function won't be a problem. However, it uses a cache. This cache should be local to sage.categories.category.Category, but when defining the join in the helper module, we'd like to have cython-speed access to it. OTOH, we could define the cache in the helper module, and then assign it as an attribute to Category, if we think that the cache is needed outside of the helper.

simon-king-jena commented 10 years ago
comment:42

Something more about the join:

I think this would be a good compromise between cythonisation and "keeping the function where it belongs". In particular, the documentation would stay in Categories.join. Since _sort, _sort_uniq and _flatten_categories are not part of the docs anyway (or am I mistaken?), I think there should be no problem to move these over to the helper module.

simon-king-jena commented 10 years ago
comment:43

Nicolas, can you comment on this. In the join construction, I see the line

        todo.update( (category, axiom)
                     for axiom in axioms.difference(axs) )

Since category is fixed, wouldn't this mean to just set todo[category] to the last axiom in axioms.difference(axs)? Is this a bug, or intended?

simon-king-jena commented 10 years ago
comment:44

Replying to @simon-king-jena:

Nicolas, can you comment on this. In the join construction, I see the line

        todo.update( (category, axiom)
                     for axiom in axioms.difference(axs) )

Since category is fixed, wouldn't this mean to just set todo[category] to the last axiom in axioms.difference(axs)? Is this a bug, or intended?

Oooops, sorry, todo is a set, not a dict. So, everything is alright...

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

df486b8Move most of Category.join to category_cy_helper.pyx